Commit Graph

243 Commits

Author SHA1 Message Date
0b02518374 found out that some couldn't checkout due to conflicts, but if you force it, it works 2025-05-14 09:36:12 +02:00
ccd962c205 now using the metadata to get archive name 2025-05-14 09:36:06 +02:00
1f91acf6c1 moved click to optional requirements 2025-05-14 09:19:08 +02:00
c731fd3393 commented out print 2025-05-14 09:18:36 +02:00
65806ccbe3 now the metadata knows its archive name 2025-05-14 09:18:11 +02:00
ae516b6c34 removed good and put is_code_related in selection instead 2025-05-14 09:18:06 +02:00
ea3a2b72e5 added output to manual_selection in order not to overwrite every time 2025-05-14 09:16:31 +02:00
2f2cbae756 fixed typing of parameters for server version of python 2025-05-12 11:57:30 +02:00
a701dc236c when selecting, we can now choose which diff hunk is relevant and modify it if necessary 2025-05-07 10:39:21 +02:00
959184b2a8 we can now clean the dataset from useless entries 2025-05-07 10:38:41 +02:00
36b7dc5c02 added uuid when creating the dataset 2025-05-07 10:38:20 +02:00
af89051779 prompts can now have a default value when enter is pressed 2025-05-07 10:35:27 +02:00
97646cb8c3 fixed log statements 2025-05-07 10:32:02 +02:00
40fa958cf8 added uuid as id 2025-05-07 10:18:38 +02:00
b3877733cb we can now generate the datasets to be served to users 2025-04-29 15:01:46 +02:00
bde9d45c10 implemented the manual selection script 2025-04-29 14:40:58 +02:00
03b89872dd added selection field to dataset for manual_selection 2025-04-28 09:51:38 +02:00
e9816d4492 removed unused imports 2025-04-05 16:01:28 +02:00
bf8869e66c was accidentally copying over prs that were cached twice 2025-04-01 15:45:23 +02:00
d4dd72469e instead of creating a list of the comments, using the paginated list and totalCount 2025-04-01 14:46:42 +02:00
e2f313a62a made better argparse things 2025-04-01 12:15:24 +02:00
12b98bf1ef removed the throttle of pygithub to make requests faster 2025-04-01 11:45:43 +02:00
6d28d89873 added return guard to remove indent level 2025-04-01 11:01:06 +02:00
bc71a21c30 instead of leaving reason_for_failure empty for valid PRs, I now put that it's valid (even tho it's not a reason for _failure_ technically, gne gne gne...) 2025-04-01 11:00:11 +02:00
a362aba344 added a simple caching of the requests to make it much quicker to fail and restart 2025-04-01 10:14:45 +02:00
a24ffa00fc made help message shorter 2025-04-01 10:14:19 +02:00
b9d1923bd8 since the comment file might not be in the PR files (since it was reverted back to its original state), we manually need to check if it's code related 2025-04-01 09:53:26 +02:00
f7d70eed6c fixed how we get the diffs before (it was wrong), extracted the way to get the last commit before the comments 2025-04-01 09:52:57 +02:00
af4fbaa7f3 added the type of the error in the print, because some errors are not very verbose in what's going wrong 2025-04-01 09:48:24 +02:00
c31686ad63 not compiling, testing, etc. for files that are not code related 2025-04-01 09:20:37 +02:00
0b238db879 fixed the name of the archive 2025-04-01 09:20:26 +02:00
28eebf158a some users have been deleted since, so the user attribute of the comment is None 2025-04-01 09:19:52 +02:00
306e80648b added print statements 2025-04-01 09:19:35 +02:00
e697890395 now archiving both before (context for AI) and after (tests for benchmark) 2025-03-31 21:50:11 +02:00
d8ab48dc82 excluding repos that have no comments 2025-03-31 21:25:08 +02:00
352758a600 not failing on unexpected error, but writing them above the progress bar 2025-03-31 21:24:21 +02:00
c20f9d6a6c made checkout function more modular 2025-03-31 21:20:56 +02:00
f79d3d7807 did a better job of checking if the second comment is an answer to the first one 2025-03-31 21:20:17 +02:00
1a53f28ae0 now archiving the repo at the given pr 2025-03-31 15:55:23 +02:00
306aa4a985 removed the only_inject_jacoco 2025-03-31 15:49:25 +02:00
480dacea3e now using normal names 2025-03-31 15:32:18 +02:00
b482c35b90 cleaned up dataset 2025-03-31 15:31:36 +02:00
6bd30ef545 removed unused imports 2025-03-31 15:31:04 +02:00
f785364fb8 made a unique bar for the processing of the pr 2025-03-31 15:30:14 +02:00
abc642d969 fixed slight issue with naming of variables 2025-03-31 15:29:43 +02:00
941e0cb19f fixed the way we get the diffs after 2025-03-31 15:29:18 +02:00
61ed6aa1b9 fixed mistake 2025-03-31 15:29:02 +02:00
669049b7a4 now using only the new dataset version 2025-03-31 14:25:17 +02:00
35bd296c7c made clone use raising exceptions instead of updates 2025-03-31 13:21:04 +02:00
46d8d45d7c Formatted utils.py 2025-03-31 11:49:36 +02:00