ist-meic-ssof-g37

Software Security project of group 37 - MEIC @ IST 2023/2025.

Authors

@IST
Master in Computer Science and Computer Engineering
Software Security - Group 37
Winter Semester of 2024/2025

Overview

The js_analyser.py tool is a static analysis tool that identifies vulnerabilities in JavaScript program slices. It evaluates information flows from entry points (sources) to sensitive operations (sinks) and detects the application (or lack) of sanitization functions, based on customizable vulnerability patterns.

Prerequisites

Python 3.9.2

Setup

Install the required packages:

pip install -r requirements.txt

Run the application:

python ./src/js_analyser.py <path_to_slice>.js <path_to_patterns>.json

Example:

python ./src/js_analyser.py ./test/fixtures/1-basic-flow/1a-basic-flow.js ./test/fixtures/1-basic-flow/1a-basic-flow.patterns.json

Testing

To verify functionality, run the included test suite:

pytest ./test

How the Analyzer Works

Initialization

The tool begins by parsing the input JavaScript slice into an Abstract Syntax Tree (AST) using the esprima library (see Appendix A. Syntax Tree Format for more information on the AST format). Each vulnerability pattern from the input JSON file is converted into a Pattern object, which holds the pattern's sources, sinks, and sanitizers, and a flag indicating whether implicit flows are to be tracked.
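As an illustration of the pattern-loading step, here is a minimal sketch of a Pattern container built from JSON text. The JSON key names (vulnerability, sources, sinks, sanitizers, implicit), the "yes"/"no" encoding of the implicit flag, and the load_patterns helper are assumptions made for this example; the actual class in src may differ.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Pattern:
    """Hypothetical container for one vulnerability pattern."""
    name: str
    sources: list[str] = field(default_factory=list)
    sinks: list[str] = field(default_factory=list)
    sanitizers: list[str] = field(default_factory=list)
    implicit: bool = False


def load_patterns(text: str) -> list[Pattern]:
    """Convert a JSON array of pattern objects into Pattern instances."""
    # Assumed layout: one JSON object per pattern, with "implicit"
    # encoded as the string "yes" or "no".
    return [
        Pattern(
            name=entry["vulnerability"],
            sources=entry.get("sources", []),
            sinks=entry.get("sinks", []),
            sanitizers=entry.get("sanitizers", []),
            implicit=entry.get("implicit", "no") == "yes",
        )
        for entry in json.loads(text)
    ]
```

Keeping the implicit flag as a boolean on the pattern lets the rest of the analysis decide per pattern whether control-flow taint needs to be tracked.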

Analysis

When the analyze method is called, the analyzer iterates over each pattern and creates a deep copy of the AST, so that the original tree is not modified. It then calls the __analyze method, which iterates over each statement in the AST and passes it to the analyse_stmt method.

It also initializes a history dictionary that maps each analyzed variable to the vulnerabilities that have reached it through a flow.

A "vulnerability" is an instance of TaintData, which contains the source, source line, a list of sanitizers (instances of SanitizerData containing the sanitizer name and the line where it is located), and a flag implicit that indicates if the vulnerability is implicit (i.e., the source is not directly assigned to the sink).
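The data described above could be modelled roughly as follows. This is a sketch only: the field names source_line and line, the use of dataclasses, and the example values are assumptions, not the definitions used in src.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SanitizerData:
    name: str  # sanitizer function name
    line: int  # line where the sanitizer is applied


@dataclass
class TaintData:
    source: str       # name of the source
    source_line: int  # line where the source appears
    sanitizers: list[SanitizerData] = field(default_factory=list)
    implicit: bool = False  # True if the flow passes through control flow


# history maps a variable name to the taints that have reached it.
history: dict[str, list[TaintData]] = {}
history["x"] = [TaintData(source="src", source_line=1)]
history["y"] = [
    TaintData("src", 1, [SanitizerData("clean", 2)], implicit=True)
]
```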

Analyzing Statements

The analyse_stmt method will check the type of the statement and call the appropriate handler method to analyze it. For example, if the statement is an ExpressionStatement, it will call the analyze_expression_statement method.

We handle the following types of statements:

- Expression Statements: The analyze_expression_statement method calls the taint_expression method to taint the expression.
- Block Statements: The analyse_block_statement method iterates over each statement in the block and calls the analyse_stmt method on it.
- If Statements: An if statement contains a test expression, a consequent block, and an optional alternate block. The analyse_if_statement method taints the test expression (to handle implicit flows) and then analyzes the consequent and alternate blocks. To analyze each block, it creates a copy of the AST, appends the block's statements to the copy, and calls __analyze to analyze the flow as if it were a slice.
- While Statements: The analyse_while_statement method taints the test expression (to handle implicit flows) and then analyzes the block. To do so, it performs three passes with the __analyze method, each analyzing the flow as if it were a slice:
  - One pass without the loop body, to check flows where the loop is not executed.
  - One pass with the loop body, to check flows where the loop is executed once.
  - One pass with the loop body, to check flows where the loop is executed multiple times.
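The three-pass treatment of while loops can be pictured as unrolling the loop into straight-line slices. The sketch below uses plain lists of strings in place of AST statements, and it assumes that "executed multiple times" is approximated by two copies of the body; the unrolled_slices helper and its parameters are hypothetical.

```python
def unrolled_slices(prefix, body, suffix, max_unroll=2):
    """Return one statement sequence per unroll count 0..max_unroll.

    prefix: statements before the loop
    body:   statements inside the loop
    suffix: statements after the loop
    """
    return [prefix + body * count + suffix for count in range(max_unroll + 1)]


variants = unrolled_slices(
    prefix=["a = src"],
    body=["b = a", "a = b"],
    suffix=["sink(b)"],
)
for variant in variants:
    print(variant)
```

Each variant can then be analyzed as an ordinary slice, which is why a flow that only appears on the second iteration is still visible to a straight-line analysis.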

Tainting Expressions

To "taint" expressions, we implemented the taint_expression method, which will check the expression type and call the appropriate handler method to taint it. For example, if the expression is a CallExpression, it will call the taint_call_expression method.

We handle the following types of expressions:

- Call Expressions: The taint_call_expression method calls taint_expression for each argument of the call. It then taints the callee (using a different function depending on the callee type). If the callee is a sanitizer, it adds the sanitizer to all currently tainted variables (vulnerabilities). Finally, if the callee is a sink, it calls process_sink.
- Assignment Expressions: The taint_assignment_expression method taints the right-hand side of the assignment, adds the assigned variable to the history dictionary, and checks whether the variable is a sink. If it is, it calls process_sink.
- Identifier Expressions: The taint_identifier_expression method checks whether the identifier is one of the sources and, if so, creates a new vulnerability and adds it to the vulnerabilities list. If the identifier is in the history dictionary, it adds the vulnerabilities recorded there to the current vulnerabilities. If it is not in the history, the variable is not instantiated in the slice, so the method adds a vulnerability with the variable as the source and the line where it is used.
- Binary and Logical Expressions: The taint_binary_expression method handles both binary and logical expressions. It calls taint_expression for the left and right operands.
- Unary Expressions: The taint_unary_expression method calls taint_expression for the argument of the unary expression.
- Member Expressions: The taint_member_expression method calls taint_expression for the object and property expressions.

Literal expressions are not tainted, as they are not sources of vulnerabilities.
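The recursive structure of expression tainting can be sketched as below. This is a simplified stand-in, not the tool's code: nodes are plain dictionaries using ESTree-style type names, taints are reduced to sets of source names (no lines or sanitizers), and call and assignment expressions are left out. Treating an uninitialized identifier as a source follows the description of identifier expressions above.

```python
def taint_expression(node, sources, history):
    """Return the set of source names that may flow into `node`."""
    kind = node["type"]
    if kind == "Literal":
        return set()  # literals never carry taint
    if kind == "Identifier":
        name = node["name"]
        taints = set()
        if name in sources:
            taints.add(name)
        if name in history:
            taints |= history[name]
        else:
            # Not instantiated in the slice: treated as a source itself.
            taints.add(name)
        return taints
    if kind in ("BinaryExpression", "LogicalExpression"):
        return (taint_expression(node["left"], sources, history)
                | taint_expression(node["right"], sources, history))
    if kind == "UnaryExpression":
        return taint_expression(node["argument"], sources, history)
    if kind == "MemberExpression":
        return (taint_expression(node["object"], sources, history)
                | taint_expression(node["property"], sources, history))
    raise NotImplementedError(kind)


# a + 1, where `a` was previously assigned from the source `src`
expr = {
    "type": "BinaryExpression",
    "left": {"type": "Identifier", "name": "a"},
    "right": {"type": "Literal", "value": 1},
}
print(taint_expression(expr, {"src"}, {"a": {"src"}}))
```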

Processing Sinks

Sensitive sinks, as defined in the vulnerability patterns, are identified during the analysis of call expressions and assignment expressions. If tainted data reaches a sink, the flow is marked as vulnerable unless sanitized.
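A minimal sketch of what sink processing might look like, assuming each taint is a (source, sanitizers) pair. The layout of the returned findings is hypothetical and is not the tool's actual output format.

```python
def process_sink(sink, sink_line, taints):
    """Build one finding per taint that reaches `sink`.

    taints: list of (source_name, sanitizer_names) pairs.
    """
    return [
        {
            "source": source,
            "sink": sink,
            "line": sink_line,
            "sanitizers": list(sanitizers),
            # A flow with no sanitizer on it is reported as vulnerable.
            "vulnerable": not sanitizers,
        }
        for source, sanitizers in taints
    ]


findings = process_sink("write", 3, [("a", []), ("b", ["escape"])])
```

Sanitized flows are still recorded here rather than dropped, so a report can show which sanitizers were applied on the way to the sink.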

Handling Implicit Flows

When a pattern specifies implicit flows, the tool tracks taint propagation through control flow structures. A custom statement called ImplicitWrapperStatement wraps statements that could lead to implicit vulnerabilities.
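To make the idea concrete, the sketch below shows how taints from a branch condition could be attached to an assignment inside that branch, but only when the pattern asks for implicit flows. It does not model ImplicitWrapperStatement; the assignment_taints helper and the (source, implicit) pair representation are assumptions for illustration.

```python
def assignment_taints(rhs_taints, guard_taints, track_implicit):
    """Combine explicit taints of the right-hand side with guard taints.

    Returns (source, implicit) pairs. Guard taints are included only
    when the pattern tracks implicit flows.
    """
    result = [(source, False) for source in rhs_taints]
    if track_implicit:
        result += [(source, True) for source in guard_taints]
    return result


# if (c) { x = a; }  with `a` and `c` both tainted
print(assignment_taints(["a"], ["c"], track_implicit=True))
```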

Sanitization

Sanitization functions are identified during the analysis of call expressions. If a sanitizer is found, it is applied to all tainted data in the current scope.
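Applying a sanitizer to every current taint could look like the sketch below. Taints are shown as dictionaries with a sanitizers list, which is an assumption for this example. The function returns copies so that taints already stored elsewhere (for example in the history dictionary) keep their unsanitized record.

```python
def apply_sanitizer(name, line, taints):
    """Return copies of `taints` with (name, line) appended to each one."""
    return [
        {**taint, "sanitizers": taint["sanitizers"] + [(name, line)]}
        for taint in taints
    ]


current = [{"source": "a", "sanitizers": []}]
sanitized = apply_sanitizer("escape", 4, current)
```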
