+ When you download a dataset, you'll receive a ZIP archive containing a JSON file. The structure of this file depends on the selected task.
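+ The JSON maps each ID to an object with two fields: "files", which maps each file path to that file's contents, and "diffs", which maps each file path to its diff: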
+{
+  "1234": {
+    "files": {
+      "src/Main.java": "public class Main { ... }"
+    },
+    "diffs": {
+      "src/Main.java": "@@ -1,3 +1,6 @@ ..."
+    }
+  }
+}
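As a minimal sketch of consuming this structure, the Python snippet below opens a downloaded archive, parses the JSON file inside it, and lists every file per entry along with whether a diff is present for it. The helper names `load_dataset` and `summarize` and the member name `dataset.json` are hypothetical. The text does not specify the name of the JSON file, so the sketch assumes the archive holds a single JSON file and locates it by its `.json` extension.

```python
import io
import json
import zipfile


def load_dataset(archive) -> dict:
    """Return the parsed JSON from the first .json member of a ZIP archive.

    `archive` may be a filesystem path or a binary file-like object.
    Assumption: the archive holds a single JSON file; its name is not
    specified in the text, so it is located by extension.
    """
    with zipfile.ZipFile(archive) as zf:
        json_names = [n for n in zf.namelist() if n.endswith(".json")]
        if not json_names:
            raise ValueError("no JSON file found in archive")
        with zf.open(json_names[0]) as fh:
            return json.load(fh)


def summarize(dataset: dict) -> list:
    """List (entry_id, path, has_diff) for every file in every entry."""
    rows = []
    for entry_id, entry in dataset.items():
        diffs = entry.get("diffs", {})
        for path in entry.get("files", {}):
            rows.append((entry_id, path, path in diffs))
    return rows


# Build a tiny in-memory archive that mirrors the example structure.
sample = {
    "1234": {
        "files": {"src/Main.java": "public class Main { ... }"},
        "diffs": {"src/Main.java": "@@ -1,3 +1,6 @@ ..."},
    }
}
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("dataset.json", json.dumps(sample))  # hypothetical file name
buffer.seek(0)

dataset = load_dataset(buffer)
print(summarize(dataset))  # [('1234', 'src/Main.java', True)]
```

For a real download, pass the path of the ZIP file to `load_dataset` instead of the in-memory buffer.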
+ The JSON structure is similar to comment generation, with one additional field:
+{
+  "5678": {
+    "files": { ... },
+    "diffs": { ... },
+    "comments": [
+      {
+        "body": "Consider simplifying this logic.",
+        "file": "src/Util.java",
+        "location": {
+          "start_line": 42,
+          "end_line": 45
+        }
+      }
+    ]
+  }
+}
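One way to work with the "comments" array is to flatten each comment into a small record and look up the lines it points at. The sketch below is illustrative only: `ReviewComment`, `parse_comments`, and `commented_lines` are hypothetical names, and it assumes that "start_line" and "end_line" are 1-based, inclusive, and index into the file contents stored under "files". The text does not confirm these conventions, so check them against real data before relying on them.

```python
from dataclasses import dataclass


@dataclass
class ReviewComment:
    body: str
    file: str
    start_line: int
    end_line: int


def parse_comments(entry: dict) -> list:
    """Flatten the "comments" array of one entry into ReviewComment records."""
    parsed = []
    for raw in entry.get("comments", []):
        loc = raw["location"]
        parsed.append(
            ReviewComment(raw["body"], raw["file"], loc["start_line"], loc["end_line"])
        )
    return parsed


def commented_lines(entry: dict, comment: ReviewComment) -> list:
    """Return the lines of the commented file that the comment covers.

    Assumption: line numbers are 1-based and inclusive, and refer to the
    contents stored in entry["files"][comment.file].
    """
    content = entry.get("files", {}).get(comment.file)
    if content is None:
        return []
    lines = content.splitlines()
    return lines[comment.start_line - 1 : comment.end_line]


# A made-up entry with short file contents so the line range is easy to follow.
entry = {
    "files": {"src/Util.java": "class Util {\n  int a;\n  int b;\n}"},
    "diffs": {},
    "comments": [
        {
            "body": "Consider simplifying this logic.",
            "file": "src/Util.java",
            "location": {"start_line": 2, "end_line": 3},
        }
    ],
}

comments = parse_comments(entry)
snippet = commented_lines(entry, comments[0])
print(comments[0].body)
print(snippet)
```

Returning an empty list when the commented file is missing from "files" keeps the helper usable on entries where a comment refers to a file that is not included.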
+ You can choose to download the dataset with full repository context, that is, the state of the entire codebase at the time the PR was created. This may help your model better understand the broader project structure and dependencies outside of the changed files.