
SQUARE-RG/AutoChecker


AutoChecker

AutoChecker is an LLM-powered tool that automatically generates Java checkers for code checking tools. It takes a rule's text description and the rule's test cases as input.

Overview: [Figure: Overview]

Logic-guided API-Context Retrieval in the overview:

[Figure: Logic-guided API-Context Retrieval]

Which code checking tools can we currently generate checkers for?

AutoChecker can directly generate Java checkers for PMD. With small adjustments, it is also applicable to any code checking tool that checks Java code by AST traversal.
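To illustrate what "checking based on AST traversal" means, here is a minimal sketch. It is an analogy only: PMD checkers are written in Java against PMD's own AST, whereas this sketch uses Python's standard `ast` module. The class name `EmptyExceptChecker` and the helper `check` are hypothetical and are not part of AutoChecker or PMD.

```python
import ast


class EmptyExceptChecker(ast.NodeVisitor):
    """Hypothetical checker: flags `except` blocks whose body is only `pass`."""

    def __init__(self):
        self.violations = []  # line numbers of offending handlers

    def visit_ExceptHandler(self, node):
        # A handler whose body is a single `pass` swallows the exception.
        if len(node.body) == 1 and isinstance(node.body[0], ast.Pass):
            self.violations.append(node.lineno)
        # Keep traversing so nested handlers are visited too.
        self.generic_visit(node)


def check(source):
    """Parse the source into an AST, traverse it, and return violation lines."""
    checker = EmptyExceptChecker()
    checker.visit(ast.parse(source))
    return checker.violations


SAMPLE = """\
try:
    risky()
except ValueError:
    pass
"""

print(check(SAMPLE))
```

The pattern is the same in any AST-based tool: parse the code, visit the node types the rule cares about, and report the nodes that violate the rule.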

Tool Playground

We provide a tool demonstration website at https://autochecker.maskeduser.party.

Repository Contents

Directory tool -- Tool Implementation

  • entity: Data structures for the three key entities (rule, test case, and checker).
  • retriever: The semantic matcher and the retrieval scripts.
  • utils: Intermediate data and auxiliary scripts.
  • generator: The source code of AutoChecker.
  • main.py: Entry point of AutoChecker.
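As a rough mental model of the three key entities, the sketch below uses plain dataclasses. All class and field names here (`RuleTestCase`, `Rule`, `Checker`, `expected_violations`, `java_source`) are assumptions made for illustration; the actual definitions live in the entity directory and may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RuleTestCase:
    """One test case for a rule (hypothetical fields)."""
    description: str
    code: str                  # Java snippet the checker is run on
    expected_violations: int   # how many violations the checker should report


@dataclass
class Rule:
    """A rule: its text description plus its test cases (hypothetical fields)."""
    name: str
    description: str
    test_cases: List[RuleTestCase] = field(default_factory=list)


@dataclass
class Checker:
    """A generated checker for one rule (hypothetical fields)."""
    rule_name: str
    java_source: str


# Hypothetical example data, not taken from the repository.
rule = Rule(
    name="ExampleRule",
    description="Hypothetical rule text: catch blocks should not be empty.",
    test_cases=[
        RuleTestCase("empty catch block", "try { f(); } catch (Exception e) { }", 1),
        RuleTestCase("handled exception", "try { f(); } catch (Exception e) { log(e); }", 0),
    ],
)
checker = Checker(rule_name=rule.name, java_source="// generated Java checker would go here")
```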

Directory framework -- Resources from the Code Checking Framework You Write Checkers For

Taking the PMD framework as an example:

  • pmd_db: Contents of the two DBs (Full-API DB and Meta-API DB) constructed from the PMD framework.
  • pmd_project: Source code of the framework.
  • PMD-Style-ASTParser.jar: AST parser of the code checking framework.

If you want to generate checkers for another code checking tool, the same resources can be constructed quickly (refer to our paper).

Directory experiment -- Experimental Results

  • (Setup) experimental rules: Experimental_20rules.json.
  • (Setup) rules-related test case set: experimental-20rules-test-suite.
  • (RQ1) baselines: Results of the baseline experiments.
  • (RQ1) autochecker: Results of the AutoChecker evaluation experiment.
  • (RQ2) ablation: Results of the ablation experiment.
  • (RQ4) practice: Detailed data for RQ4.
    • Files ending with ".xml": Test cases additionally added in practice.
    • Files ending with ".txt": The augmented checker after iterating on those added test cases.

Detailed experimental result statistics (the best results, marked in purple in the original table, are the ones reported in our paper)

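Approach        LLM            #Rule_pc  #Rule_pot  #Rule_pat  #TC_pass  TPR_avg
NoCaseLLM       Llama3.1              0          0          0         0        0
NoCaseLLM       Llama3.1              0          0          0         0        0
NoCaseLLM       Llama3.1              0          0          0         0        0
NoCaseLLM       Qwen2.5-Coder         5          5          1        30    17.82
NoCaseLLM       Qwen2.5-Coder         4          3          1        30    11.87
NoCaseLLM       Qwen2.5-Coder         5          5          1        40    19.41
NoCaseLLM       GPT-4                 4          4          0        36    15.57
NoCaseLLM       GPT-4                 4          4          1        37    13.89
NoCaseLLM       GPT-4                 7          7          1        62    27.92
NoCaseLLM       DeepSeek-V3           7          7          1        42    24.38
NoCaseLLM       DeepSeek-V3           8          8          1        56    28.06
NoCaseLLM       DeepSeek-V3           7          7          1        54    22.79
AllCasesLLM     Llama3.1              0          0          0         0        0
AllCasesLLM     Llama3.1              0          0          0         0        0
AllCasesLLM     Llama3.1              0          0          0         0        0
AllCasesLLM     Qwen2.5-Coder         3          3          1        14    13.04
AllCasesLLM     Qwen2.5-Coder         4          4          1        17    14.40
AllCasesLLM     Qwen2.5-Coder         3          3          1        14    13.04
AllCasesLLM     GPT-4                 5          5          1        38    15.84
AllCasesLLM     GPT-4                 5          5          2        36    21.53
AllCasesLLM     GPT-4                 6          5          2        52    19.40
AllCasesLLM     DeepSeek-V3           6          6          2        43    24.60
AllCasesLLM     DeepSeek-V3           6          6          2        50    23.26
AllCasesLLM     DeepSeek-V3           5          4          2        18    15.89
NoCaseLLMR      Llama3.1              1          1          0         1     2.50
NoCaseLLMR      Llama3.1              0          0          0         0        0
NoCaseLLMR      Llama3.1              2          2          0        16     4.71
NoCaseLLMR      Qwen2.5-Coder         7          7          0        43    22.90
NoCaseLLMR      Qwen2.5-Coder         8          8          1        53    27.76
NoCaseLLMR      Qwen2.5-Coder         9          9          2        60    30.68
NoCaseLLMR      GPT-4                10         10          1       108    30.82
NoCaseLLMR      GPT-4                 9          9          1        92    25.95
NoCaseLLMR      GPT-4                 6          6          1        38    22.73
NoCaseLLMR      DeepSeek-V3           9          9          2        92    32.05
NoCaseLLMR      DeepSeek-V3          10          9          1        94    27.84
NoCaseLLMR      DeepSeek-V3          10         10          1        58    27.60
NoCaseLLMC      Llama3.1              0          0          0         0        0
NoCaseLLMC      Llama3.1              0          0          0         0        0
NoCaseLLMC      Llama3.1              0          0          0         0        0
NoCaseLLMC      Qwen2.5-Coder         5          5          1        26    15.74
NoCaseLLMC      Qwen2.5-Coder         6          6          1        45    21.18
NoCaseLLMC      Qwen2.5-Coder         5          5          1        30    18.60
NoCaseLLMC      GPT-4                 7          7          0        42    22.19
NoCaseLLMC      GPT-4                 8          8          1        94    27.26
NoCaseLLMC      GPT-4                 8          8          1        59    22.86
NoCaseLLMC      DeepSeek-V3           9          9          0        66    29.40
NoCaseLLMC      DeepSeek-V3           8          8          1        52    28.71
NoCaseLLMC      DeepSeek-V3           9          9          1       105    27.97
NoCaseLLMRC     Llama3.1              0          0          0         0        0
NoCaseLLMRC     Llama3.1              2          2          0         7     6.25
NoCaseLLMRC     Llama3.1              2          2          0         8     4.94
NoCaseLLMRC     Qwen2.5-Coder         9          9          1        60    30.49
NoCaseLLMRC     Qwen2.5-Coder         8          8          1        55    27.09
NoCaseLLMRC     Qwen2.5-Coder         8          7          1        41    24.19
NoCaseLLMRC     GPT-4                 8          7          1        34    22.87
NoCaseLLMRC     GPT-4                 8          8          1       105    27.74
NoCaseLLMRC     GPT-4                 8          8          0        39    22.50
NoCaseLLMRC     DeepSeek-V3          11         11          1       101    38.93
NoCaseLLMRC     DeepSeek-V3          11         11          1       119    33.83
NoCaseLLMRC     DeepSeek-V3          11         11          1       108    32.09
AutoChecker     Llama3.1              3          3          1        22     8.41
AutoChecker     Llama3.1              2          2          0        17     4.12
AutoChecker     Llama3.1              1          1          0        17     4.47
AutoChecker     Qwen2.5-Coder        20         20          4       257    79.01
AutoChecker     Qwen2.5-Coder        18         18          3       226    65.62
AutoChecker     Qwen2.5-Coder        18         18          2       241    68.90
AutoChecker     GPT-4                20         20          6       278    82.28
AutoChecker     GPT-4                17         17          6       228    66.63
AutoChecker     GPT-4                20         20          5       261    79.36
AutoChecker     DeepSeek-V3          18         18          5       257    76.59
AutoChecker     DeepSeek-V3          19         19          4       278    80.86
AutoChecker     DeepSeek-V3          19         19          5       264    75.86
AutoCheckerWoR  GPT-4                18         18          2       230    61.88
AutoCheckerWoR  GPT-4                15         15          3       225    56.38
AutoCheckerWoR  GPT-4                18         18          2       231    67.27
AutoCheckerWoR  DeepSeek-V3          15         15          2       221    59.17
AutoCheckerWoR  DeepSeek-V3          14         14          2       229    58.03
AutoCheckerWoR  DeepSeek-V3          14         14          2       210    55.83
AutoCheckerWoI  GPT-4                 6          6          2        31    24.16
AutoCheckerWoI  GPT-4                 8          8          2        65    29.37
AutoCheckerWoI  GPT-4                 5          5          2        33    19.24
AutoCheckerWoI  DeepSeek-V3          14         14          4       141    53.44
AutoCheckerWoI  DeepSeek-V3          15         15          3       162    52.79
AutoCheckerWoI  DeepSeek-V3          13         13          3       110    42.96
AutoCheckerWoM  GPT-4                17         17          3       256    66.42
AutoCheckerWoM  GPT-4                16         16          1       189    54.15
AutoCheckerWoM  GPT-4                16         16          3       239    60.09
AutoCheckerWoM  DeepSeek-V3          18         18          1       258    72.92
AutoCheckerWoM  DeepSeek-V3          15         15          2       215    60.30
AutoCheckerWoM  DeepSeek-V3          15         15          1       209    59.13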
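The README does not define the five metric columns of the table above, so the following is only a sketch under an assumed reading: the first three columns count rules whose checker compiles, passes at least one test case, and passes all test cases; the fourth is the total number of passed test cases; and the last is the per-rule test pass rate averaged over all rules, in percent. The names `RuleResult` and `summarize` are hypothetical, and the paper remains the authoritative source for the definitions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RuleResult:
    """Outcome of one generated checker on one rule (hypothetical structure)."""
    compiled: bool
    passed: int   # test cases passed
    total: int    # test cases available for the rule


def summarize(results: List[RuleResult]) -> Dict[str, float]:
    """Aggregate per-rule outcomes under the assumed metric definitions."""
    ok = [r for r in results if r.compiled]
    # A checker that does not compile contributes a pass rate of 0.
    rates = [r.passed / r.total if r.compiled and r.total else 0.0 for r in results]
    return {
        "rule_pc": len(ok),
        "rule_pot": sum(r.passed >= 1 for r in ok),
        "rule_pat": sum(r.total > 0 and r.passed == r.total for r in ok),
        "tc_pass": sum(r.passed for r in ok),
        "tpr_avg": round(100 * sum(rates) / len(rates), 2) if rates else 0.0,
    }


# Hypothetical data: 20 rules, only one checker compiles and passes 1 of 2 tests.
results = [RuleResult(True, 1, 2)] + [RuleResult(False, 0, 2) for _ in range(19)]
summary = summarize(results)
print(summary)
```

Under this reading, a single compiling checker that passes half of its rule's test cases across 20 rules gives an average of 50 / 20 = 2.5 percent, which matches the shape of the table row "1 1 0 1 2.50".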
