|
a8ccf081a2
|
formatted file
|
2025-05-21 09:18:33 +02:00 |
|
|
d48c5d04b8
|
removed stat that has become useless
|
2025-05-20 16:50:21 +02:00 |
|
|
33cea7bbb4
|
added cute little units for the progress bars
|
2025-05-20 16:50:04 +02:00 |
|
|
ea7b510926
|
added another way a PR can be invalid: when it has
no lines for its comment (the GitHub API is weird)
|
2025-05-20 16:44:02 +02:00 |
|
|
09ee7995ff
|
made more general function to move logging to file
|
2025-05-20 16:40:26 +02:00 |
|
|
c577b3a6e5
|
saving all the results after any exception
|
2025-05-20 09:59:51 +02:00 |
|
|
e6c5c8df82
|
sorting the values in descending order, so the
topmost values of the given column come first
|
2025-05-20 09:59:09 +02:00 |
|
|
975b25f2f6
|
removed print statements
|
2025-05-20 09:58:55 +02:00 |
|
|
a3a89bb346
|
moved some logic around
|
2025-05-20 09:58:49 +02:00 |
|
|
b0443cc87f
|
added exclusion list
|
2025-05-20 09:57:53 +02:00 |
|
|
04a37030f4
|
populating cache only if there is any cache
|
2025-05-20 09:57:29 +02:00 |
|
|
15ffe67b0e
|
added fields to the metadata to make manual
filtering easier
|
2025-05-20 09:56:54 +02:00 |
|
|
3ffbb229b8
|
fixed typo
|
2025-05-20 09:50:00 +02:00 |
|
|
6e0aca2ad5
|
fixed typo
|
2025-05-20 09:48:31 +02:00 |
|
|
4d9c47f33a
|
added all the progress bars for each worker
|
2025-05-17 10:45:57 +02:00 |
|
|
98db478b7b
|
first draft of parallelization (NOT TESTED YET)
|
2025-05-17 09:42:02 +02:00 |
|
|
25072ac8b3
|
removed unused import
|
2025-05-17 09:37:33 +02:00 |
|
|
b90111c652
|
no longer crashing when no GITHUB_API_TOKEN is
given, just printing a warning instead
|
2025-05-17 09:37:31 +02:00 |
|
|
970ee1c363
|
added the possibility of sorting the incoming csv
by a certain column, now taking any csv instead of
the result of clone_repos.py
|
2025-05-17 09:32:40 +02:00 |
|
|
3ea3e980bd
|
added metavar names for arguments
|
2025-05-17 09:16:52 +02:00 |
|
|
5cf5e5a8ee
|
added enumchoiceaction for easier enum handling in
argparse
|
2025-05-16 19:41:15 +02:00 |
|
|
14e64984c5
|
instead of adding the cache as we go through the
repos, just add it before any processing, so we are sure to keep all the previously saved data
|
2025-05-16 19:39:59 +02:00 |
|
|
b84ea797ff
|
moved enums to top of file
|
2025-05-16 19:11:26 +02:00 |
|
|
25161c4d46
|
refactored manual selection into smaller bits,
easier to consume
|
2025-05-16 09:58:14 +02:00 |
|
|
63c6785b4d
|
when asking for entries that implement the
suggested change, we now only keep the ones that
are code related
|
2025-05-14 20:53:02 +02:00 |
|
|
c55d21d042
|
removed unused import
|
2025-05-14 20:52:52 +02:00 |
|
|
17823e39fe
|
added placeholder for paraphrases
|
2025-05-14 20:52:37 +02:00 |
|
|
cabf8fd823
|
added the build of the reference map to the dataset
|
2025-05-14 20:52:17 +02:00 |
|
|
5328fe59e1
|
moved dependency to somewhere optional
|
2025-05-14 15:40:52 +02:00 |
|
|
5403ff5d4d
|
added code to extract the expected output given a
dataset (to test the website)
|
2025-05-14 15:40:10 +02:00 |
|
|
acce738872
|
made the requests never expire
|
2025-05-14 09:39:55 +02:00 |
|
|
a13a29c6de
|
removed buggy continue
|
2025-05-14 09:39:33 +02:00 |
|
|
726a0d92e1
|
using output (forgot to commit it before)
|
2025-05-14 09:36:52 +02:00 |
|
|
0b02518374
|
found out that some couldn't be checked out due to
conflicts, but forcing the checkout works
|
2025-05-14 09:36:12 +02:00 |
|
|
ccd962c205
|
now using the metadata to get archive name
|
2025-05-14 09:36:06 +02:00 |
|
|
1f91acf6c1
|
moved click to optional requirements
|
2025-05-14 09:19:08 +02:00 |
|
|
c731fd3393
|
commented out print
|
2025-05-14 09:18:36 +02:00 |
|
|
65806ccbe3
|
now the metadata knows its archive name
|
2025-05-14 09:18:11 +02:00 |
|
|
ae516b6c34
|
removed good and put is_code_related in selection instead
|
2025-05-14 09:18:06 +02:00 |
|
|
ea3a2b72e5
|
added output to manual_selection in order not to
overwrite every time
|
2025-05-14 09:16:31 +02:00 |
|
|
2f2cbae756
|
fixed typing of parameters for server version of
python
|
2025-05-12 11:57:30 +02:00 |
|
|
a701dc236c
|
when selecting, we can now choose which diff hunk
is relevant and modify it if necessary
|
2025-05-07 10:39:21 +02:00 |
|
|
959184b2a8
|
we can now clean useless entries out of the dataset
|
2025-05-07 10:38:41 +02:00 |
|
|
36b7dc5c02
|
added uuid when creating the dataset
|
2025-05-07 10:38:20 +02:00 |
|
|
af89051779
|
prompts can now have a default value when enter is
pressed
|
2025-05-07 10:35:27 +02:00 |
|
|
97646cb8c3
|
fixed log statements
|
2025-05-07 10:32:02 +02:00 |
|
|
40fa958cf8
|
added uuid as id
|
2025-05-07 10:18:38 +02:00 |
|
|
b3877733cb
|
we can now generate the datasets to be served to users
|
2025-04-29 15:01:46 +02:00 |
|
|
bde9d45c10
|
implemented the manual selection script
|
2025-04-29 14:40:58 +02:00 |
|
|
03b89872dd
|
added selection field to dataset for manual_selection
|
2025-04-28 09:51:38 +02:00 |
|