58 Commits

Author SHA1 Message Date
f36fcc6e05 updated the way the comment generation and code refinement inputs are exported (automated adding the archives for context) 2025-06-10 23:40:44 +02:00
f5bdfd1a1b the input to code refinement now ignores paraphrases 2025-06-10 20:45:51 +02:00
dd52e43000 added way to put paraphrases from external csv 2025-06-10 20:42:55 +02:00
45a8122408 using enum choice action instead of the previous thing we were using 2025-06-03 10:10:36 +02:00
66d046cbaa made filename a positional argument 2025-06-03 10:10:19 +02:00
87b49b377d the removal of the is_code_related field in selection broke backwards compatibility. Fixed it 2025-06-02 10:46:47 +02:00
5b8357567b removed code relatedness from manual selection since now it's already done by pull_requests 2025-06-02 09:48:27 +02:00
0b182837c1 added new option for dataset 2025-05-27 10:50:10 +02:00
15ffe67b0e added fields to the metadata to make manual filtering easier 2025-05-20 09:56:54 +02:00
b84ea797ff moved enums to top of file 2025-05-16 19:11:26 +02:00
63c6785b4d when asking for entries that implement the suggested change, we now only keep the ones that are code related 2025-05-14 20:53:02 +02:00
c55d21d042 removed unused import 2025-05-14 20:52:52 +02:00
17823e39fe added placeholder for paraphrases 2025-05-14 20:52:37 +02:00
cabf8fd823 added the build of the reference map to the dataset 2025-05-14 20:52:17 +02:00
5328fe59e1 moved dependency to somewhere optional 2025-05-14 15:40:52 +02:00
c731fd3393 commented out print 2025-05-14 09:18:36 +02:00
65806ccbe3 now the metadata knows its archive name 2025-05-14 09:18:11 +02:00
ae516b6c34 removed good and put is_code_related in selection instead 2025-05-14 09:18:06 +02:00
2f2cbae756 fixed typing of parameters for server version of python 2025-05-12 11:57:30 +02:00
959184b2a8 we can now clean the dataset from useless entries 2025-05-07 10:38:41 +02:00
97646cb8c3 fixed log statements 2025-05-07 10:32:02 +02:00
40fa958cf8 added uuid as id 2025-05-07 10:18:38 +02:00
b3877733cb we can now generate the datasets to be served to users 2025-04-29 15:01:46 +02:00
03b89872dd added selection field to dataset for manual_selection 2025-04-28 09:51:38 +02:00
480dacea3e now using normal names 2025-03-31 15:32:18 +02:00
b482c35b90 cleaned up dataset 2025-03-31 15:31:36 +02:00
669049b7a4 now using only the new dataset version 2025-03-31 14:25:17 +02:00
308f58b587 fixed final edge case 2025-03-30 10:58:48 +02:00
d24c9d8461 removed progress bar that was instant 2025-03-29 10:25:31 +01:00
7e64ab6574 moved github logging to file 2025-03-29 09:44:00 +01:00
dd5a67561b commented out annoying code 2025-03-29 09:43:51 +01:00
e081560879 imported function from utils 2025-03-29 09:43:39 +01:00
aaafe21a3c added progress bar for each entry migration 2025-03-29 09:43:18 +01:00
d7cba34e3d made it so that binary file contents are ignored 2025-03-28 18:18:36 +01:00
69bf557a61 made migration better 2025-03-28 15:04:01 +01:00
649043d9f0 first draft of migration to augment the data 2025-03-28 11:15:21 +01:00
2e04ed49a3 fixed slight mistake 2025-03-26 14:50:36 +01:00
0d8b81054d formatted code 2025-03-26 13:05:05 +01:00
fa3b7f82a1 converted code to have a list of comments instead of a single comment 2025-03-26 12:41:46 +01:00
4c6522ae63 removed useless code 2025-03-26 12:41:01 +01:00
4c56a352e7 added default parameter to keep or drop the "Was still being processed" item from the backup (leaving the choice to the caller) 2025-03-26 09:02:32 +01:00
d95e4ebdf8 fixed slight mistake 2025-03-23 13:55:24 +01:00
dc897ac375 first draft of using cache to resume progress 2025-03-23 09:52:43 +01:00
282f29520b since there can be multiple jacoco.xml files and it's too hard to tell which one is the correct one for the class, I just log in the coverages each one that has the fully qualified class that is commented; it will then be up to us to filter out what's needed 2025-03-21 13:44:58 +01:00
a0e17b62bb removed coverage from each file since now we have the file pointed by the comment, moved the coverage to the metadata 2025-03-20 14:01:43 +01:00
cd24068d50 added comment path, to know which file in the PR files was commented 2025-03-20 10:52:16 +01:00
adaa1c7fd4 changed the structure of the entries 2025-03-17 15:45:19 +01:00
92c57b10b2 added the coverage to the file entries 2025-03-17 15:26:51 +01:00
3d9b386f22 changed the structure of the dataset 2025-03-17 15:26:45 +01:00
b2d7377d1f removed the from_json, because we don't use it 2025-03-17 15:26:27 +01:00