The researchers investigate a precise approach to vulnerability analysis for multi-tier web applications with dynamic features. Rather than relying on strictly static analysis, the paper combines dynamic analysis of web applications with static analysis to identify vulnerabilities automatically and then generate concrete exploits as proof of those vulnerabilities.
The investigated area is important because, although various approaches exist for analyzing the security of modern web applications, the analysis techniques they use to identify vulnerabilities such as Cross-Site Scripting (XSS) and SQL Injection (SQLI) tend to generate false alarms, which is one of their main drawbacks. This paper overcomes that drawback, which shows the significance of the research.
The authors also investigate the automatic verification of vulnerabilities. Other approaches go a step further and attempt to generate concrete exploits, but they rely heavily on static analysis. Although such methods can cover the whole application, they often sacrifice precision because they struggle to handle programs with complex artifacts, which is one of the main reasons they produce false positives. Static analysis is particularly difficult in the presence of web applications' dynamic features: content such as links and forms is often produced on the fly, and code at different tiers interacts in ways that are hard to model statically.
Furthermore, the authors investigate dynamic execution. First, the dynamic execution component reduces the difficulty faced by static analysis by revealing run-time artifacts that no longer have to be modeled statically, while the static analysis component guides its dynamic counterpart in maximizing application coverage by analyzing application paths and providing the inputs needed to exercise them. Second, the authors' approach handles a very large set of applications, significantly surpassing the state of the art.
The paper further examines the dynamic execution component: its ability to reduce the difficulty faced by the static analysis component is the main driver of the approach's high scalability. With minimal analysis setup overhead, the authors' goal is to enable automatic exploit generation for various classes of vulnerabilities. To reach this point, the approach was built around several analysis templates, together with an attack dictionary used to instantiate each template. Other approaches also aim for this kind of generality in vulnerability identification, but the authors extend it with (a) precise, targeted dynamic analysis techniques and (b) automatic exploit generation for the identified vulnerabilities. The authors implement their approach in a tool called NAVEX.
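To make the template-plus-dictionary idea concrete, the following minimal Python sketch shows one way such an attack dictionary could be organized; the vulnerability classes, attack strings, and sanitizer lists here are illustrative assumptions, not the actual contents of NAVEX's dictionary.

```python
# Illustrative sketch only: a toy attack dictionary mapping each vulnerability
# class to candidate attack strings and the sanitizers assumed to neutralize
# them. The entries are hypothetical and do not reproduce NAVEX's dictionary.
ATTACK_DICTIONARY = {
    "xss": {
        "attack_strings": ["<script>alert(1)</script>",
                           "\"><img src=x onerror=alert(1)>"],
        "sanitizers": ["htmlspecialchars", "htmlentities"],
    },
    "sqli": {
        "attack_strings": ["' OR '1'='1' -- ", "1 UNION SELECT NULL -- "],
        "sanitizers": ["mysqli_real_escape_string", "intval"],
    },
}

def instantiate(vuln_class):
    """Return the attack strings used to instantiate the template for one class."""
    return ATTACK_DICTIONARY[vuln_class]["attack_strings"]

print(instantiate("sqli"))
```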
How are the objectives of the paper achieved?
The objectives of the paper are achieved through an approach that is implemented and evaluated in a tool called NAVEX, which scales the process of automatic vulnerability analysis and exploit generation to multiple classes of vulnerabilities and to large applications.
The authors use a specific architecture and set of algorithms to achieve the objectives of the paper. In the vulnerable sink identification step, NAVEX analyzes each module separately; the implicit goal of this step is to exclude from later steps any modules that cannot contain a vulnerable sink. As depicted graphically in the paper, NAVEX builds a graph model of each module's code and then discovers paths containing data flows between sources and sinks. Finally, it uses symbolic execution to model the execution of each path as a formula, and constraint solving to determine which paths are potentially exploitable.
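The formula-plus-solver step can be pictured with a small sketch. Assuming Z3's Python bindings (an assumption on my part; the paper uses a constraint solver, but this exact code is not from it), the path constraints and the attack condition are conjoined and the solver is asked for a witness input:

```python
# Minimal sketch (assumption: z3py is available). A symbolic string stands in
# for the user-controlled input; F_path models branch conditions on the path
# to the sink, and F_attack requires the input to carry an XSS payload.
from z3 import String, StringVal, Solver, Contains, Length, sat

user_input = String("user_input")        # e.g., the value of $_GET['q']
solver = Solver()

solver.add(Length(user_input) > 0)                         # F_path (illustrative)
solver.add(Contains(user_input, StringVal("<script>")))    # F_attack (illustrative)

if solver.check() == sat:
    print("potentially exploitable; witness input:", solver.model()[user_input])
else:
    print("no input satisfies the path and attack constraints")
```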
The authors describe each of the components used to achieve their goal: the Attack Dictionary, Graph Construction, Graph Traversal, and Exploit String Generation. To address the challenge of discovering multiple classes of vulnerabilities, they design NAVEX to be easily extensible to a wide range of vulnerabilities, such as XSS and SQLI, as well as logic vulnerabilities such as command injection and EAR. In the graph construction step, the authors build a graph model of each PHP module that represents its possible execution paths, which are later executed symbolically. The graph model is based on Code Property Graphs (CPGs), which combine the call graph, abstract syntax trees (AST), data dependence graphs (DDG), and control flow graphs (CFG) in a single representation so that vulnerabilities can be discovered by modeling them as graph queries. In particular, given a source and a sink instruction, CPGs can be used to find data dependency paths between the corresponding variables.
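As a rough illustration of such a query, the sketch below searches a toy data-dependence graph for paths from a source to a sink using networkx; the node names are hypothetical and mirror the Listing 1 example, and NAVEX itself runs graph-database queries over its enhanced CPG rather than using this library.

```python
# Illustrative source-to-sink query over a toy data-dependence graph.
# Node names are hypothetical; an edge means "value flows from -> to".
import networkx as nx

ddg = nx.DiGraph()
ddg.add_edge("$_POST['edition']", "$edition")
ddg.add_edge("$edition", "mysql_query(...)")

sources = ["$_POST['edition']"]
sinks = ["mysql_query(...)"]

for src in sources:
    for sink in sinks:
        for path in nx.all_simple_paths(ddg, src, sink):
            print("candidate vulnerable path:", " -> ".join(path))
```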
To demonstrate how sanitization tags are assigned in NAVEX, the authors consider the statement at line 9 of Listing 1 in the paper, $edition = (int)$_POST['edition']. NAVEX begins by inspecting the AST of this assignment so that an appropriate tag can be assigned to $_POST['edition'] first; the sanitization status is then propagated to $edition. In this case, the tag assigned to $_POST['edition'] is san-all because, in the attack dictionary of the research, the cast to integer sanitizes the value for all vulnerability classes. Consequently, the variable $edition receives the same sanitization tag. The authors use the FilterSanNodes function, in which sanitization and DB tags are used to prune paths that are unpromising for exploit generation. DB tags, in particular, are used when searching for SQLI vulnerabilities: for each write query, NAVEX parses the query using a SQL parser to extract the necessary information, such as the table and column names. The extracted information is then matched against the DB tags to derive constraints on column data types and values (Fdb). These constraints are used in the next step in conjunction with the path constraints (Fpath).
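A hedged sketch of the tag idea for the Listing 1 example is given below; the tag names echo the paper's san-all notion, but the functions and data layout are my own illustration rather than NAVEX's implementation.

```python
# Illustrative sanitization-tag propagation for $edition = (int)$_POST['edition'].
SAN_ALL = "san-all"   # sanitized for every vulnerability class
UNSAN = "unsan"       # untrusted and unsanitized

def tag_for_rhs(rhs):
    """Assign a tag to the right-hand side of an assignment (toy rules)."""
    if rhs.get("cast") == "int":
        return SAN_ALL                    # an integer cast sanitizes for all classes
    if rhs.get("origin") == "user_input":
        return UNSAN
    return rhs.get("tag", UNSAN)

def propagate(assignment, tags):
    """Propagate the RHS tag to the assigned variable."""
    tags[assignment["lhs"]] = tag_for_rhs(assignment["rhs"])

def filter_san_nodes(paths, tags):
    """FilterSanNodes-style pruning: drop paths whose final value is fully sanitized."""
    return [p for p in paths if tags.get(p[-1]) != SAN_ALL]

tags = {}
propagate({"lhs": "$edition",
           "rhs": {"cast": "int", "origin": "user_input"}}, tags)
print(tags)  # {'$edition': 'san-all'} -> paths ending in $edition get pruned
```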
To achieve the objective of generating concrete exploits, NAVEX executes several steps, as shown in Figure 3 of the paper. A dynamic execution step first generates the navigation graph, which captures the possible sequences in which the application modules can be executed. Next, the navigation graph is used to discover execution paths leading only to those modules that contain the vulnerable sinks uncovered by the vulnerable sink identification step. Finally, the final exploits are generated; each of these steps is described in the next section of the paper.
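A small sketch of the navigation-graph idea follows; the module names and edges are invented for illustration, and NAVEX's real graph is built by crawling the deployed application, but the search for a request sequence that reaches a vulnerable module can be pictured as a breadth-first search:

```python
from collections import deque

# Toy navigation graph: each key is a module, each value the modules reachable
# from it by one observed HTTP request (all names are hypothetical).
NAV_GRAPH = {
    "login.php": ["index.php"],
    "index.php": ["messages.php"],
    "messages.php": ["reply.php"],   # module containing the vulnerable sink
    "reply.php": [],
}

def request_sequence(entry, target):
    """Shortest chain of HTTP requests from the entry module to the target."""
    queue, seen = deque([[entry]]), {entry}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in NAV_GRAPH.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(request_sequence("login.php", "reply.php"))
# ['login.php', 'index.php', 'messages.php', 'reply.php']
```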
What are the results of the developed approach/tool?
The results of the developed approach indicate that NAVEX constructed a total of 204 exploits, of which 9 are for logic vulnerabilities and 195 are for injection vulnerabilities. The sanitization-tag-enhanced CPG reduced false positives (FPs) by 87 percent on average. Including the analysis of client-side code when building the navigation graph improved the precision of exploit generation by 54 percent on average. On the evaluation set, NAVEX was able to drill down as deep as 6 HTTP requests to stitch exploits together.
The authors evaluated NAVEX on real-world PHP applications comprising 3.2M SLOC and 22.7K PHP files, as listed in Table 1 of the paper. Their criteria for selecting the applications were (i) choosing PHP applications that are large, complex, and popular, like HotCRP, and (ii) enabling a comparison of NAVEX on the same test applications used by state-of-the-art work in exploit generation (e.g., Chainsaw) and vulnerability analysis (e.g., RIPS). NAVEX was deployed on an Ubuntu 12.04 LTS VM with 40GB of RAM and 2 cores at 2.4 GHz each.
The authors first generated the enhanced CPG and used it to find the exploitable paths in all the applications. Next, the applications with exploitable paths were deployed. Deployment involves installing each application on a server, creating login credentials for every role, and populating the application with initial data; population is done by submitting forms and navigating the application. The authors take a snapshot of each database and use it after every crawl to restore the database to its original state. It should be noted that each application's deployment instructions prevented the authors from automating this step and evaluating more applications. The results indicate that, apart from the manual deployment, NAVEX can be used to analyze and generate exploits for numerous applications given unlimited time.
According to the summary of results provided in the paper, NAVEX constructed 204 exploits, of which 9 were for logic vulnerabilities and the rest for injection vulnerabilities. The sanitization-tag-enhanced CPG decreased false positives by roughly 87 percent on average, and including the analysis of client-side code when building the navigation graph increased the precision of exploit generation by roughly 54 percent on average. On the evaluation set, NAVEX succeeded in drilling down as deep as 6 HTTP requests.
Regarding the statistics of the enhanced code property graph, the CPG construction time and size for each application under test are shown in Table 2. It should be noted that the enhanced graph represents the source code of all applications with reduced runtime overhead.
The navigation graph statistics summarize the total time of NAVEX's step 2 for the generation of concrete exploits. The table provided in the paper lists the applications for which NAVEX found exploitable paths; navigation behavior is not modeled for applications without exploitable paths. For each application, the number of account types reflects the number of roles. The navigation graph (NG) contains 59K nodes and 1M edges. NAVEX generated SQLI exploits for all applications with SQLI seeds except SchoolMate, for which the crawler recovered only three HTTP requests. This application includes 5 roles, and the authors' crawler logged in successfully, but every time an HTTP request was sent after logging in, execution was redirected to the login page, meaning the application does not maintain user sessions properly. Coverage therefore decreased because the crawler failed to proceed. This erroneous application was selected in the authors' evaluation for the comparison between NAVEX and other work. The exploitable sinks that were reported were validated to be true positives (TPs).
Regarding the selected SQLI exploits, WeBid is the application for which NAVEX created the most SQLI exploits. An exploitable sink in the user interface (UI) is shown in Listing 4: an authenticated user can check the messages of other users (line 3), and those messages are then labeled as noticed (line 6). A generated exploit for both sinks is shown in Listing 5. The performance and scalability measurements show that NAVEX's performance in finding exploitable sinks and generating exploits is given by the total times in Figure 5 of the paper. The blue bar illustrates the complete time of the Step 1 analysis over all applications under test for each type of vulnerability, while the orange bar records the total time consumed by Step 2 for all the applications that have exploitable sinks.
For the dynamic analysis coverage, the authors take the number of vulnerabilities identified statically by step 1 as the baseline for assessing the coverage of the second step. NAVEX constructed 105 exploits for 155 (SQLI) sinks, 90 for 128 XSS sinks, and 9 for 19 EAR vulnerabilities; the overall coverage of the second step, measured against the applications' total vulnerable sinks, is therefore 68 percent (204 exploits out of 302 sinks). The results on the effect of sanitization tags show that enhancing the CPG with sanitization and DB tags affects the total number of reported vulnerable sinks, as shown in Figure 6; the orange bar illustrates the total number of vulnerable sinks after the reduction in false positives. For each type of vulnerability, the number of reported vulnerable sinks decreased by almost 87 percent because the enhancements applied to the CPG substantially eliminated false positives.
What did you learn from this paper?
The paper has taught me that NAVEX is an automatic exploit generation system that takes into account the dynamic features and navigational complexities of modern web applications. Moreover, the paper has shown me that this system significantly outperforms prior work in the efficiency, precision, and scalability of exploit generation. I have learned that the dynamic execution component's ability to reduce the difficulty faced by the static analysis component is the main driver of NAVEX's high scalability. The paper has also enhanced my understanding of how NAVEX addresses the challenge of discovering multiple classes of vulnerabilities: it is easily extensible to a wide range of vulnerabilities, such as XSS and SQLI, as well as logic vulnerabilities such as command injection and EAR.
I have learned from this paper about Code Property Graphs (CPGs), which combine the call graph, abstract syntax trees (AST), data dependence graphs (DDG), and control flow graphs (CFG) in a single representation so that vulnerabilities can be discovered by modeling them as graph queries. The paper has taught me that deployment consists of installing each application on a server, creating login credentials for every role, and populating the application with initial data, and that each application's deployment instructions prevented the researchers from automating this step and evaluating more applications. Moreover, the paper has taught me that the sanitization-tag-enhanced CPG decreases false positives by almost 87 percent on average, that an erroneous application can be selected for the comparison between NAVEX and other work, and that WeBid is the application for which NAVEX creates the most SQLI exploits. One interesting thing I learned is that the closest work to NAVEX for web applications is Chainsaw, a system that uses purely static analysis to build concrete exploits. I also learned the two respects in which NAVEX differs from Chainsaw: first, NAVEX performs a combination of static and dynamic analyses, which enables it to find more exploits and to scale better to large applications; second, NAVEX supports finding exploits for multiple classes of vulnerabilities.
This paper has added to my knowledge of QED, which generates XSS attacks and first-order SQLI for Java web applications using model checking and static analysis. I also came to know that NAVEX's navigation modeling is inspired by previous work, namely a system that finds workflow and data vulnerabilities by analyzing web applications' modules.
I have learned that Chainsaw supports generating exploits for SQLI and XSS, and that it is compared to NAVEX with respect to the total number of generated SQLI and XSS exploits, as well as some performance measurements, in Table 8 of the paper. The paper has taught me that NAVEX is able to construct 19 different exploits in myBloggie, WeBid, FAQforge, WebChess, and geccbblite, and achieves the same results for DNscript, scarf, and Eve. It has also taught me that NAVEX did not create exploits for SchoolMate because of the issues linked to maintaining user sessions, whereas Chainsaw, whose exploit generation is done statically, is able to generate exploits for that application. I have learned that NAVEX should be chosen over Chainsaw because NAVEX significantly outperforms it in terms of efficiency: Chainsaw generates its exploits in 112 minutes, while NAVEX takes 25 minutes and 2 seconds for the same procedure. The paper has also taught me that the total time to search for and build the navigation graph in NAVEX, 18 minutes and 26 seconds, can be contrasted with the total time to search for and construct the Refined Workflow Graph (RWFG) in Chainsaw, 1 day, 13 hours, and 21 minutes. After reading the whole paper, I have learned that the techniques used in NAVEX can improve exploit generation efficiency without losing precision.
Write a critical commentary on the strengths and weaknesses of the work
taking into consideration previous work in the area.
The paper as a whole is full of strengths; the NAVEX system itself is the most significant strength of this paper. However, there are some limitations as well: certain web application features are not supported yet, and therefore the coverage of the approach is limited. For example, forms with inputs of type file require the user to select and upload an actual file. For the test set in the paper, this can be made to work with the solver, but more engineering effort is required to make it work across all platforms.
Exploit generation for web applications is one of the strengths of the paper. In previous research, exploit generation has seen a lot of interest, particularly for binary applications. The closest work to NAVEX for web applications is Chainsaw, a system that uses purely static analysis to build concrete exploits. NAVEX differs from Chainsaw in two respects: first, it performs a combination of static and dynamic analysis, which enables it to find more exploits and to scale better to large applications; second, it supports finding exploits for multiple classes of vulnerabilities. The taint tracking and concolic execution in the research are similar to previous work used to construct SQLI and XSS attack vectors, such as CRAXweb, which employs concrete and symbolic execution supported by a constraint solver to generate SQLI and XSS exploits. Related prior work also includes QED, which generates XSS attacks and first-order SQLI for Java web applications using model checking and static analysis.
Previous research generates inputs to expose SQLI vulnerabilities using concolic execution of PHP applications, which might point to a weakness of this research. In the previous literature, researchers have combined static analysis and constraint solving to find exploits in web applications; the authors of this research use the same procedure for another purpose. Moreover, previous researchers have developed exploits for parameter-tampering vulnerabilities. However, these works did not consider whole-application paths and are limited to a single PHP module. Modeling with code property graphs has also been performed in previous work.
Researchers in previous work introduced the notion of CPGs for vulnerability modeling and discovery in C programs, and in follow-up work applied CPGs to vulnerability detection on PHP applications; this seems a weakness of this paper, as the authors mostly focus on NAVEX. On the other hand, a strength of the paper is that it uses the efficiency and flexibility offered by CPGs and goes a step further so that actual executable exploits can be generated. As a consequence, the authors enhance CPGs with additional attributes, which is a strength and a unique contribution of the paper. Vulnerability analysis has been done extensively in the previous literature, and there is a large body of research examining server-side vulnerability detection: static analysis approaches are broadly used in previous works, dynamic analysis approaches have also been explored, and only a limited number of studies have explored hybrid approaches.
The authors of this paper, conversely, mainly focus on NAVEX. Although NAVEX employs some of these analysis techniques to find vulnerabilities, its aim differs from these works in that NAVEX constructs exploits for the identified vulnerabilities. The authors' navigation modeling is inspired by previous work, namely a system that finds workflow and data vulnerabilities by analyzing web applications' modules. NAVEX advances these analyses by combining dynamic and static analyses so that concrete exploits can be constructed for large web applications.
Alhuzali, A., Gjomemo, R., Eshete, B. & Venkatakrishnan, V., 2018. NAVEX: Precise and Scalable Exploit Generation for Dynamic Web Applications. s.l.: USENIX, pp. 376-392.