AutoChecker is an LLM-based tool that automatically generates Java checkers for code checking tools, taking a rule's text description and its test cases as input.
[Figure: Logic-guided API-Context Retrieval in the overview]
AutoChecker directly generates Java checkers for PMD and, with small adjustments, applies to any code checking tool that checks Java code by AST traversal.
We provide a tool demonstration website at https://autochecker.maskeduser.party.
- entity: Stores the data structures of the three key entities (rule, test case, and checker).
- retriever: Stores the semantic matcher and the retrieval scripts.
- utils: Stores intermediate data and auxiliary scripts.
- generator: The source code of AutoChecker.
- main.py: Entry point of AutoChecker.

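As a rough illustration of the three entities, the sketch below models a rule, a test case, and a checker as Python dataclasses. Every class and field name here is hypothetical and is not taken from the entity directory; the actual data structures may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TestCase:
    """One rule test case (hypothetical fields)."""
    description: str
    code: str                 # Java snippet the checker is run on
    expected_violations: int  # number of violations the checker should report


@dataclass
class Rule:
    """A checking rule: its text description plus its test cases."""
    name: str
    description: str
    test_cases: List[TestCase] = field(default_factory=list)


@dataclass
class Checker:
    """A generated checker and the per-test-case outcome of running it."""
    rule: Rule
    java_source: str
    passed: List[bool] = field(default_factory=list)

    def pass_rate(self) -> float:
        """Fraction of test cases passed; 0.0 when nothing was run."""
        if not self.passed:
            return 0.0
        return sum(self.passed) / len(self.passed)
```

A checker that passes one of a rule's two test cases would then report a pass rate of 0.5.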
Directory framework -- useful resources from the code checking framework for which you choose to write checkers
Taking the PMD framework as an example:

- pmd_db: Contents of the two DBs (Full-API DB and Meta-API DB) constructed on the PMD framework.
- pmd_project: Source code of the framework.
- PMD-Style-ASTParser.jar: The AST parser of the code checking framework.

If you generate checkers for another code checking tool, this information can also be constructed quickly (refer to our paper).
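To give a feel for what retrieving API context from such a DB could look like, here is a minimal sketch that ranks DB entries against a piece of checking logic. It is an assumption-laden stand-in: the DB layout, the `retrieve` function, and the token-overlap (Jaccard) scoring are all hypothetical and replace the actual semantic matcher; the API names are PMD-style examples, and their descriptions were written only for this illustration.

```python
import re
from typing import List, Tuple

# Hypothetical Meta-API entries: (API signature, natural-language description).
# The descriptions are made up for this example.
META_API_DB: List[Tuple[str, str]] = [
    ("ASTMethodDeclaration.isAbstract()", "check whether the method is abstract"),
    ("ASTMethodDeclaration.getName()", "get the name of the method"),
    ("ASTClassOrInterfaceDeclaration.isInterface()",
     "check whether the class is an interface"),
]


def tokens(text: str) -> set:
    """Lowercase word tokens of a sentence."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, db: List[Tuple[str, str]], top_k: int = 1) -> List[str]:
    """Return the top_k API signatures whose descriptions best overlap the query.

    Scoring is plain Jaccard similarity over word tokens, used here only as a
    simple placeholder for a real semantic matcher.
    """
    q = tokens(query)

    def score(entry: Tuple[str, str]) -> float:
        d = tokens(entry[1])
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    ranked = sorted(db, key=score, reverse=True)
    return [api for api, _ in ranked[:top_k]]
```

For instance, `retrieve("check whether method is abstract", META_API_DB)` ranks the `isAbstract()` entry first under this scoring.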
- (Setup) experimental rules: Experimental_20rules.json.
- (Setup) rules-related test case set: experimental-20rules-test-suite.
- (RQ1) baselines: Results of the baseline experiments.
- (RQ1) autochecker: Results of the AutoChecker evaluation experiment.
- (RQ2) ablation: Results of the ablation experiment.
- (RQ4) practice: Detailed data about RQ4.
  - Files ending with ".xml": Test cases additionally added in practice.
  - Files ending with ".txt": The augmented checker after iterating over those added test cases.

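The ".xml" test case files can be inspected with a few lines of Python. The sketch below assumes they follow PMD's rule-test layout (a `test-data` root containing `test-code` elements with `description`, `expected-problems`, and `code` children); that layout is an assumption about these files, and `load_cases` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical sample in the assumed PMD rule-test layout.
SAMPLE = """<test-data xmlns="http://pmd.sourceforge.net/rule-tests">
    <test-code>
        <description>violating case</description>
        <expected-problems>1</expected-problems>
        <code><![CDATA[
public class Foo { }
        ]]></code>
    </test-code>
    <test-code>
        <description>compliant case</description>
        <expected-problems>0</expected-problems>
        <code><![CDATA[
public class Bar { }
        ]]></code>
    </test-code>
</test-data>
"""


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def load_cases(xml_text: str) -> List[Dict[str, object]]:
    """Collect description, expected problem count, and code of each test case."""
    root = ET.fromstring(xml_text)
    cases = []
    for node in root.iter():
        if local(node.tag) != "test-code":
            continue
        fields = {local(child.tag): (child.text or "").strip() for child in node}
        cases.append({
            "description": fields.get("description", ""),
            "expected_problems": int(fields.get("expected-problems", "0")),
            "code": fields.get("code", ""),
        })
    return cases
```

Calling `load_cases(SAMPLE)` yields one dictionary per `test-code` element, which makes it easy to count how many added cases expect a violation.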
| Approach | LLM | #Rulepc | #Rulepot | #Rulepat | #TCpass | TPRavg |
|---|---|---|---|---|---|---|
| NoCaseLLM | Llama3.1 | 0 | 0 | 0 | 0 | 0 |
| | | 0 | 0 | 0 | 0 | 0 |
| | | 0 | 0 | 0 | 0 | 0 |
| | Qwen2.5-Coder | 5 | 5 | 1 | 30 | 17.82 |
| | | 4 | 3 | 1 | 30 | 11.87 |
| | | 5 | 5 | 1 | 40 | 19.41 |
| | GPT-4 | 4 | 4 | 0 | 36 | 15.57 |
| | | 4 | 4 | 1 | 37 | 13.89 |
| | | 7 | 7 | 1 | 62 | 27.92 |
| | DeepSeek-V3 | 7 | 7 | 1 | 42 | 24.38 |
| | | 8 | 8 | 1 | 56 | 28.06 |
| | | 7 | 7 | 1 | 54 | 22.79 |
| AllCasesLLM | Llama3.1 | 0 | 0 | 0 | 0 | 0 |
| | | 0 | 0 | 0 | 0 | 0 |
| | | 0 | 0 | 0 | 0 | 0 |
| | Qwen2.5-Coder | 3 | 3 | 1 | 14 | 13.04 |
| | | 4 | 4 | 1 | 17 | 14.40 |
| | | 3 | 3 | 1 | 14 | 13.04 |
| | GPT-4 | 5 | 5 | 1 | 38 | 15.84 |
| | | 5 | 5 | 2 | 36 | 21.53 |
| | | 6 | 5 | 2 | 52 | 19.40 |
| | DeepSeek-V3 | 6 | 6 | 2 | 43 | 24.60 |
| | | 6 | 6 | 2 | 50 | 23.26 |
| | | 5 | 4 | 2 | 18 | 15.89 |
| NoCaseLLMR | Llama3.1 | 1 | 1 | 0 | 1 | 2.50 |
| | | 0 | 0 | 0 | 0 | 0 |
| | | 2 | 2 | 0 | 16 | 4.71 |
| | Qwen2.5-Coder | 7 | 7 | 0 | 43 | 22.90 |
| | | 8 | 8 | 1 | 53 | 27.76 |
| | | 9 | 9 | 2 | 60 | 30.68 |
| | GPT-4 | 10 | 10 | 1 | 108 | 30.82 |
| | | 9 | 9 | 1 | 92 | 25.95 |
| | | 6 | 6 | 1 | 38 | 22.73 |
| | DeepSeek-V3 | 9 | 9 | 2 | 92 | 32.05 |
| | | 10 | 9 | 1 | 94 | 27.84 |
| | | 10 | 10 | 1 | 58 | 27.60 |
| NoCaseLLMC | Llama3.1 | 0 | 0 | 0 | 0 | 0 |
| | | 0 | 0 | 0 | 0 | 0 |
| | | 0 | 0 | 0 | 0 | 0 |
| | Qwen2.5-Coder | 5 | 5 | 1 | 26 | 15.74 |
| | | 6 | 6 | 1 | 45 | 21.18 |
| | | 5 | 5 | 1 | 30 | 18.60 |
| | GPT-4 | 7 | 7 | 0 | 42 | 22.19 |
| | | 8 | 8 | 1 | 94 | 27.26 |
| | | 8 | 8 | 1 | 59 | 22.86 |
| | DeepSeek-V3 | 9 | 9 | 0 | 66 | 29.40 |
| | | 8 | 8 | 1 | 52 | 28.71 |
| | | 9 | 9 | 1 | 105 | 27.97 |
| NoCaseLLMRC | Llama3.1 | 0 | 0 | 0 | 0 | 0 |
| | | 2 | 2 | 0 | 7 | 6.25 |
| | | 2 | 2 | 0 | 8 | 4.94 |
| | Qwen2.5-Coder | 9 | 9 | 1 | 60 | 30.49 |
| | | 8 | 8 | 1 | 55 | 27.09 |
| | | 8 | 7 | 1 | 41 | 24.19 |
| | GPT-4 | 8 | 7 | 1 | 34 | 22.87 |
| | | 8 | 8 | 1 | 105 | 27.74 |
| | | 8 | 8 | 0 | 39 | 22.50 |
| | DeepSeek-V3 | 11 | 11 | 1 | 101 | 38.93 |
| | | 11 | 11 | 1 | 119 | 33.83 |
| | | 11 | 11 | 1 | 108 | 32.09 |
| AutoChecker | Llama3.1 | 3 | 3 | 1 | 22 | 8.41 |
| | | 2 | 2 | 0 | 17 | 4.12 |
| | | 1 | 1 | 0 | 17 | 4.47 |
| | Qwen2.5-Coder | 20 | 20 | 4 | 257 | 79.01 |
| | | 18 | 18 | 3 | 226 | 65.62 |
| | | 18 | 18 | 2 | 241 | 68.90 |
| | GPT-4 | 20 | 20 | 6 | 278 | 82.28 |
| | | 17 | 17 | 6 | 228 | 66.63 |
| | | 20 | 20 | 5 | 261 | 79.36 |
| | DeepSeek-V3 | 18 | 18 | 5 | 257 | 76.59 |
| | | 19 | 19 | 4 | 278 | 80.86 |
| | | 19 | 19 | 5 | 264 | 75.86 |
| AutoCheckerWoR | GPT-4 | 18 | 18 | 2 | 230 | 61.88 |
| | | 15 | 15 | 3 | 225 | 56.38 |
| | | 18 | 18 | 2 | 231 | 67.27 |
| | DeepSeek-V3 | 15 | 15 | 2 | 221 | 59.17 |
| | | 14 | 14 | 2 | 229 | 58.03 |
| | | 14 | 14 | 2 | 210 | 55.83 |
| AutoCheckerWoI | GPT-4 | 6 | 6 | 2 | 31 | 24.16 |
| | | 8 | 8 | 2 | 65 | 29.37 |
| | | 5 | 5 | 2 | 33 | 19.24 |
| | DeepSeek-V3 | 14 | 14 | 4 | 141 | 53.44 |
| | | 15 | 15 | 3 | 162 | 52.79 |
| | | 13 | 13 | 3 | 110 | 42.96 |
| AutoCheckerWoM | GPT-4 | 17 | 17 | 3 | 256 | 66.42 |
| | | 16 | 16 | 1 | 189 | 54.15 |
| | | 16 | 16 | 3 | 239 | 60.09 |
| | DeepSeek-V3 | 18 | 18 | 1 | 258 | 72.92 |
| | | 15 | 15 | 2 | 215 | 60.30 |
| | | 15 | 15 | 1 | 209 | 59.13 |

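To compare configurations at a glance, the three TPRavg values listed for each (approach, LLM) pair can be averaged. The sketch below does this for the GPT-4 rows of AutoChecker and its three ablated variants, copying the values from the table above. Treating the three rows as repetitions of the same setting is an assumption, and `mean_tpr` is a hypothetical helper.

```python
from statistics import mean
from typing import Dict, List, Tuple

# TPRavg values copied from the table: three rows per (approach, LLM) pair.
TPR: Dict[Tuple[str, str], List[float]] = {
    ("AutoChecker", "GPT-4"): [82.28, 66.63, 79.36],
    ("AutoCheckerWoR", "GPT-4"): [61.88, 56.38, 67.27],
    ("AutoCheckerWoI", "GPT-4"): [24.16, 29.37, 19.24],
    ("AutoCheckerWoM", "GPT-4"): [66.42, 54.15, 60.09],
}


def mean_tpr(table: Dict[Tuple[str, str], List[float]]) -> Dict[Tuple[str, str], float]:
    """Average the listed TPRavg values per configuration, rounded to 2 decimals."""
    return {key: round(mean(values), 2) for key, values in table.items()}


if __name__ == "__main__":
    for (approach, llm), value in mean_tpr(TPR).items():
        print(f"{approach:15s} {llm:6s} {value:.2f}")
```

Under this reading, the full AutoChecker with GPT-4 averages about 76.09, against roughly 61.84, 24.26, and 60.22 for the WoR, WoI, and WoM variants.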
