Commit Graph

263 Commits

Author SHA1 Message Date
Karma Riuk
6e0aca2ad5 fixed typo 2025-05-20 09:48:31 +02:00
Karma Riuk
4d9c47f33a added all the progress bars for each worker 2025-05-17 10:45:57 +02:00
Karma Riuk
98db478b7b first draft of parallelization (NOT TESTED YET) 2025-05-17 09:42:02 +02:00
Karma Riuk
25072ac8b3 removed unused import 2025-05-17 09:37:33 +02:00
Karma Riuk
b90111c652 now i'm not crashing when no GITHUB_API_TOKEN is given, rather just printing a warning 2025-05-17 09:37:31 +02:00
Karma Riuk
970ee1c363 added the possibility of sorting the incoming csv by a certain column, now taking any csv instead of the result of clone_repos.py 2025-05-17 09:32:40 +02:00
Karma Riuk
3ea3e980bd added metavar names for arguments 2025-05-17 09:16:52 +02:00
Karma Riuk
5cf5e5a8ee added enumchoiceaction for easier enum handling in argparse 2025-05-16 19:41:15 +02:00
Karma Riuk
14e64984c5 instead of adding the cache as we go through the repos, just add it before any processing, so we are sure to keep all the previously saved data 2025-05-16 19:39:59 +02:00
Karma Riuk
b84ea797ff moved enums to top of file 2025-05-16 19:11:26 +02:00
Karma Riuk
25161c4d46 refactored manual selection into smaller bits, easier to consume 2025-05-16 09:58:14 +02:00
Karma Riuk
63c6785b4d when asking for entries that implement the suggested change, we now only keep the ones that are code related 2025-05-14 20:53:02 +02:00
Karma Riuk
c55d21d042 removed unused import 2025-05-14 20:52:52 +02:00
Karma Riuk
17823e39fe added placeholder for paraphrases 2025-05-14 20:52:37 +02:00
Karma Riuk
cabf8fd823 added the build of the reference map to the dataset 2025-05-14 20:52:17 +02:00
Karma Riuk
5328fe59e1 moved dependency to somewhere optional 2025-05-14 15:40:52 +02:00
Karma Riuk
5403ff5d4d added code to extract the expected output given a dataset (to test the website) 2025-05-14 15:40:10 +02:00
Karma Riuk
acce738872 made the requests never expire 2025-05-14 09:39:55 +02:00
Karma Riuk
a13a29c6de removed buggy continue 2025-05-14 09:39:33 +02:00
Karma Riuk
726a0d92e1 using output (forgot to commit it before) 2025-05-14 09:36:52 +02:00
Karma Riuk
0b02518374 found out that some couldn't checkout due to conflicts, but if you force it, it works 2025-05-14 09:36:12 +02:00
Karma Riuk
ccd962c205 now using the metadata to get archive name 2025-05-14 09:36:06 +02:00
Karma Riuk
1f91acf6c1 moved click to optional requirements 2025-05-14 09:19:08 +02:00
Karma Riuk
c731fd3393 commented out print 2025-05-14 09:18:36 +02:00
Karma Riuk
65806ccbe3 now the metadata knows its archive name 2025-05-14 09:18:11 +02:00
Karma Riuk
ae516b6c34 removed good and put is_code_related in selection instead 2025-05-14 09:18:06 +02:00
Karma Riuk
ea3a2b72e5 added output to manual_selection in order not to overwrite every time 2025-05-14 09:16:31 +02:00
Karma Riuk
2f2cbae756 fixed typing of parameters for server version of python 2025-05-12 11:57:30 +02:00
Karma Riuk
a701dc236c when selecting, we can now choose which diff hunk is relevant and modify it if necessary 2025-05-07 10:39:21 +02:00
Karma Riuk
959184b2a8 we can now clean the dataset from useless entries 2025-05-07 10:38:41 +02:00
Karma Riuk
36b7dc5c02 added uuid when creating the dataset 2025-05-07 10:38:20 +02:00
Karma Riuk
af89051779 prompts can now have a default value when enter is pressed 2025-05-07 10:35:27 +02:00
Karma Riuk
97646cb8c3 fixed log statements 2025-05-07 10:32:02 +02:00
Karma Riuk
40fa958cf8 added uuid as id 2025-05-07 10:18:38 +02:00
Karma Riuk
b3877733cb we can now generate the datasets to be served to users 2025-04-29 15:01:46 +02:00
Karma Riuk
bde9d45c10 implemented the manual selection script 2025-04-29 14:40:58 +02:00
Karma Riuk
03b89872dd added selection field to dataset for manual_selection 2025-04-28 09:51:38 +02:00
Karma Riuk
e9816d4492 removed unused imports 2025-04-05 16:01:28 +02:00
Karma Riuk
bf8869e66c was accidentally copying over prs that were cached twice 2025-04-01 15:45:23 +02:00
Karma Riuk
d4dd72469e instead of creating a list of the comments, using the paginated list and totalCount 2025-04-01 14:46:42 +02:00
Karma Riuk
e2f313a62a made better argparse things 2025-04-01 12:15:24 +02:00
Karma Riuk
12b98bf1ef removed the throttle of pygithub to make requests faster 2025-04-01 11:45:43 +02:00
Karma Riuk
6d28d89873 added return guard to remove indent level 2025-04-01 11:01:06 +02:00
Karma Riuk
bc71a21c30 instead of leaving reason_for_failure empty for valid PRs, I now put that it's valid (even tho it's not a reason for _failure_ technically, gne gne gne...) 2025-04-01 11:00:11 +02:00
Karma Riuk
a362aba344 added a simple caching of the requests to make it much quicker to fail and restart 2025-04-01 10:14:45 +02:00
Karma Riuk
a24ffa00fc made help message shorter 2025-04-01 10:14:19 +02:00
Karma Riuk
b9d1923bd8 since the comment file might not be in the PR files (since it was reverted back to its original state), we manually need to check if it's code related 2025-04-01 09:53:26 +02:00
Karma Riuk
f7d70eed6c fixed how we get the diffs before (it was wrong), extracted the way to get the last commit before the comments 2025-04-01 09:52:57 +02:00
Karma Riuk
af4fbaa7f3 added the type of the error in the print, because some errors are not very verbose in what's going wrong 2025-04-01 09:48:24 +02:00
Karma Riuk
c31686ad63 not compiling, testing, etc. for files that are not code related 2025-04-01 09:20:37 +02:00