mirror of
https://github.com/karma-riuk/crab-webapp.git
synced 2025-07-04 22:08:12 +02:00
365 lines
16 KiB
HTML
365 lines
16 KiB
HTML
<!DOCTYPE html>
|
||
<html lang="en">
|
||
|
||
<head>
|
||
<meta charset="UTF-8" />
|
||
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
||
<link rel="icon" type="image/x-icon" href="/img/crab.png">
|
||
<title>CRAB Webapp</title>
|
||
<link rel="stylesheet" href="css/style.css">
|
||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
|
||
<script src="https://cdn.socket.io/4.5.4/socket.io.min.js"></script>
|
||
<script defer src="js/index.js"></script>
|
||
<script defer src="js/sorttable.js"></script>
|
||
<script defer src="js/modal.js"></script>
|
||
</head>
|
||
|
||
<body>
|
||
<header class="site-header">
|
||
<h3 id="page-title">
|
||
<img src="/img/crab.png" alt="Crab" class="crab-icon">
|
||
Crab Webapp
|
||
</h3>
|
||
<button id="about-button" class="info-btn">About</button>
|
||
</header>
|
||
<main>
|
||
|
||
<fieldset>
|
||
<legend>
|
||
<strong>Download a Dataset</strong>
|
||
<button id="info-download-btn" class='info-btn'><i class="fa fa-info"></i></button>
|
||
</legend>
|
||
|
||
<label for="dataset-select">Dataset:</label>
|
||
<select id="dataset-select">
|
||
<option value="comment_generation">Comment Generation</option>
|
||
<option value="code_refinement">Code Refinement</option>
|
||
</select>
|
||
<label>
|
||
<input type="checkbox" id="with-context">
|
||
Include context
|
||
</label>
|
||
<br /><br />
|
||
<button id="download-dataset">Download</button>
|
||
</fieldset>
|
||
|
||
<fieldset>
|
||
<legend>
|
||
<strong>Upload Your Predictions</strong>
|
||
<button id="info-upload-btn" class='info-btn'><i class="fa fa-info"></i></button>
|
||
</legend>
|
||
<label for="answer-type">Type:</label>
|
||
<select id="answer-type">
|
||
<option value="comment">Comment Generation</option>
|
||
<option value="refinement">Code Refinement</option>
|
||
</select>
|
||
<br /><br />
|
||
<input type="file" id="file-input" accept="application/json" />
|
||
<br /><br />
|
||
<div style="display: flex; align-items: center; gap: 0.5em">
|
||
<button id="upload-btn">Upload JSON</button>
|
||
<div id="upload-status" class="hidden" style="color: green;"></div>
|
||
</div>
|
||
</fieldset>
|
||
|
||
<fieldset>
|
||
<legend>
|
||
<strong>Getting Information About Ongoing Process</strong>
|
||
<button id="info-results-btn" class='info-btn'><i class="fa fa-info"></i></button>
|
||
</legend>
|
||
<div style="display: flex; align-items: center; gap: 0.5em">
|
||
<label for="uuid">Process id:</label>
|
||
<input type="text" id="uuid" required />
|
||
</div>
|
||
<br /><br />
|
||
<button id="request-status">Request status</button>
|
||
<span id="status-status" class="hidden" style="color: green;"> </span>
|
||
</fieldset>
|
||
|
||
|
||
<div id="progress-container" class="hidden">
|
||
<h3>Processing answers...</h3>
|
||
<div>
|
||
<progress id="progress-bar" value="0" max="100"></progress>
|
||
<span id="progress-text">0%</span>
|
||
</div>
|
||
</div>
|
||
|
||
<div id="comment" class="results-container hidden">
|
||
<h3>Results Comment Generation<button class="download-results">Download</button></h3>
|
||
<table class="sortable">
|
||
<thead>
|
||
<tr>
|
||
<th>id</th>
|
||
<th>Proposed comment</th>
|
||
<th>Correct file</th>
|
||
<th>Distance</th>
|
||
<th>Max bleu score</th>
|
||
</tr>
|
||
</thead>
|
||
<tbody>
|
||
</tbody>
|
||
</table>
|
||
</div>
|
||
|
||
<div id="refinement" class="results-container hidden">
|
||
<h3>Results Code Refinement<button class="download-results">Download</button></h3>
|
||
<table class="sortable">
|
||
<thead>
|
||
<tr>
|
||
<th>id</th>
|
||
<th>Compiled</th>
|
||
<th>Tested</th>
|
||
</tr>
|
||
</thead>
|
||
<tbody>
|
||
</tbody>
|
||
</table>
|
||
</div>
|
||
|
||
<div id="modal-overlay" class="modal-overlay hidden" tabindex=-1>
|
||
<div class="modal-container">
|
||
<button id="modal-close" class="modal-close" aria-label="Close">×</button>
|
||
<div id="modal-content"></div>
|
||
</div>
|
||
</div>
|
||
|
||
<template id="about">
|
||
<h2>About this project</h2>
|
||
<div>
|
||
<p>
|
||
This project introduces <strong>CRAB (Code Review Automated Benchmark)</strong>, a high-quality
|
||
benchmark designed to evaluate deep learning-based code review automation tools. It focuses on two
|
||
key tasks:
|
||
</p>
|
||
<ul>
|
||
<li><strong>Comment Generation</strong>: Generating natural language review comments that identify
|
||
issues and suggest improvements for a given piece of code.</li>
|
||
<li><strong>Code Refinement</strong>: Producing revised code that correctly implements the
|
||
suggestions from a review comment.</li>
|
||
</ul>
|
||
<p>
|
||
The dataset consists of
|
||
carefully curated triplets
|
||
<code><submitted_code, reviewer_comment, revised_code></code>—ensuring each comment is
|
||
actionable and each revision implements the suggested change. This eliminates noise common in
|
||
previous datasets and supports reliable, meaningful evaluation.
|
||
</p>
|
||
<p>
|
||
To support model benchmarking, we also provide a web-based evaluation platform (the website on which
|
||
you are reading this description) that allows researchers to download the dataset, submit their
|
||
predictions, and assess model performance across both tasks.
|
||
</p>
|
||
<p>
|
||
You can explore the source code for each component here:
|
||
</p>
|
||
<ul>
|
||
<li><a href="https://github.com/karma-riuk/crab" target="_blank">Dataset Construction Repository</a>
|
||
</li>
|
||
<li><a href="https://github.com/karma-riuk/crab-webapp" target="_blank">Web App Repository</a></li>
|
||
</ul>
|
||
|
||
<p>
|
||
This website lets you evaluate code review models against the CRAB benchmark. You can download input
|
||
files for either the comment generation or code refinement task, upload your model’s predictions,
|
||
and view the results once processing is complete. Each section includes a help icon
|
||
<button class='info-btn'><i class="fa fa-info"></i></button> that provides more detailed
|
||
instructions and file format guidelines.
|
||
</p>
|
||
</div>
|
||
</template>
|
||
|
||
<template id="info-download">
|
||
<h2>Download a Dataset</h2>
|
||
<div>
|
||
<p>
|
||
When you download a dataset, you'll receive a ZIP archive containing a JSON file. The structure of
|
||
this file depends on the selected task.
|
||
</p>
|
||
|
||
<section class="json-schemas">
|
||
<details>
|
||
<summary><strong>Comment Generation</strong></summary>
|
||
<p>The JSON maps each ID to an object with:</p>
|
||
<ul>
|
||
<li><strong>files</strong>: a map of filenames to their content at the start of the pull
|
||
request.</li>
|
||
<li><strong>diffs</strong>: a map of filenames to the diff that was applied to each file
|
||
before
|
||
the comment was made.</li>
|
||
</ul>
|
||
<pre><code>[
|
||
{
|
||
"id": "1234",
|
||
"files": {
|
||
"src/Main.java": "public class Main { ... }"
|
||
},
|
||
"diffs": {
|
||
"src/Main.java": "@@ -1,3 +1,6 @@ ..."
|
||
}
|
||
}
|
||
]</code></pre>
|
||
</details>
|
||
|
||
<details>
|
||
<summary><strong>Code Refinement</strong></summary>
|
||
<p>The JSON structure is similar to comment generation, with one additional field:</p>
|
||
<ul>
|
||
<li><strong>files</strong>: the initial version of each file in the PR.</li>
|
||
<li><strong>diffs</strong>: the diff applied before the comment was made.</li>
|
||
<li><strong>comments</strong>: a list of comments, each with a body, the file it refers to,
|
||
and
|
||
the exact location of the comment.</li>
|
||
</ul>
|
||
<pre><code lang="json">[
|
||
{
|
||
"id": "5678",
|
||
"files": { ... },
|
||
"diffs": { ... },
|
||
"comments": [
|
||
{
|
||
"body": "Consider simplifying this logic.",
|
||
"file": "src/Util.java",
|
||
"from_": 42,
|
||
"to": 45
|
||
}
|
||
]
|
||
}
|
||
]</code></pre>
|
||
</details>
|
||
</section>
|
||
|
||
<h3>With Context (Optional)</h3>
|
||
<p>
|
||
You can choose to download the dataset with full repository context — the state of the entire
|
||
codebase at the time the PR was created. This may help your model better understand the broader
|
||
project structure and dependencies outside of the changed files.
|
||
</p>
|
||
</div>
|
||
</template>
|
||
<template id="info-upload">
|
||
<h2>Upload Your Predictions</h2>
|
||
<section>
|
||
<p>
|
||
After downloading a dataset and generating your predictions for either task, you can upload your
|
||
results here to start the evaluation process.
|
||
</p>
|
||
<p>
|
||
Your uploaded JSON file must follow one of the schemas described below, depending on the selected
|
||
task. Once uploaded, the system will begin evaluating your submission. A progress bar will appear to
|
||
show how far along the evaluation is.
|
||
</p>
|
||
<p>
|
||
For <strong>Code Refinement</strong>, an id will also be displayed. This id allows you to safely
|
||
close the browser tab and later check the evaluation progress by pasting it into the <em>"Get
|
||
status of ongoing process"</em> section. More information is available in that section.
|
||
</p>
|
||
<section class="json-schemas">
|
||
<details>
|
||
<summary>
|
||
<strong>Comment Generation</strong>
|
||
</summary>
|
||
<p>
|
||
Submit a JSON object where each key is a string ID and each value is an object representing
|
||
the proposed comment for that instance. Each comment must specify the file path, the
|
||
starting line, the ending line (or null), and the comment text itself.
|
||
</p>
|
||
<p>
|
||
All fields are required and must follow the expected types exactly. The <code>from_</code>
|
||
field can be null if the comment applies to a single line.
|
||
</p>
|
||
<pre><code>
|
||
{
|
||
"1234": {
|
||
"path": "src/Main.java",
|
||
"from_": 10,
|
||
"to": 12,
|
||
"body": "Consider extracting this block into a separate method."
|
||
},
|
||
"5678": {
|
||
"path": "src/Util.java",
|
||
"from_": null,
|
||
"to": 42,
|
||
"body": "You might want to add a null check here."
|
||
}
|
||
}
|
||
</code></pre>
|
||
</details>
|
||
|
||
<details>
|
||
<summary><strong>Code Refinement</strong></summary>
|
||
<p>
|
||
Submit a JSON object where each key is a string ID, and the value is another object that
|
||
maps
|
||
file
|
||
paths (relative to the top-level directory) to the new file content.
|
||
</p>
|
||
<pre><code>
|
||
{
|
||
"1234": {
|
||
"src/Main.java": "public class Main { /* updated code */ }"
|
||
},
|
||
"5678": {
|
||
"utils/Helper.java": "public class Helper { /* improved logic */ }"
|
||
}
|
||
}
|
||
</code></pre>
|
||
|
||
<p>
|
||
Make sure your file strictly follows the expected format to avoid upload errors.
|
||
</p>
|
||
</details>
|
||
</section>
|
||
</section>
|
||
</template>
|
||
<template id="info-results">
|
||
<h2>Getting Information About Ongoing Process</h2>
|
||
<div>
|
||
<p>
|
||
After you upload your predictions for either the <strong>Comment Generation</strong> or <strong>Code
|
||
Refinement</strong> task, the evaluation begins in the background. Depending on the size of your
|
||
submission and how busy the system is, this process may take some time. This is especially true for
|
||
the code refinement task, which involves compiling and testing the submitted code.
|
||
</p>
|
||
|
||
<p>
|
||
To help you track progress, the system assigns a unique <strong>process ID</strong> to every
|
||
submission. This ID will be shown to you right after the upload is complete. Be sure to save or copy
|
||
this ID, as it is the only way to check on the status of your evaluation if you close or refresh the
|
||
page.
|
||
</p>
|
||
|
||
<p>
|
||
To view the current status of a submission, enter your process ID in the field labeled
|
||
<strong>"Process id"</strong> in this section and click the <strong>"Request status"</strong>
|
||
button. The system will respond by showing whether your evaluation is still running, has finished
|
||
successfully, or encountered an error.
|
||
</p>
|
||
|
||
<p>
|
||
When the process is complete, the results will automatically appear in a summary table in place of
|
||
the progress bar. You will also see an option to download the results as a JSON file. The JSON
|
||
includes more detailed information than what is shown in the table, such as exact evaluation scores,
|
||
file-level data, and other metadata that may be useful for in-depth analysis.
|
||
</p>
|
||
|
||
<p>
|
||
If the process is still running, you can come back at any time and use the same ID to check the
|
||
status again. The evaluation runs entirely in the background on our servers, so you do not need to
|
||
keep the page open. You can safely close the browser, shut down your computer, or return on a
|
||
different device. As long as you have your process ID, you will be able to retrieve your results
|
||
later.
|
||
</p>
|
||
|
||
<p>
|
||
<strong>Important:</strong> If you lose or forget your process ID, you will not be able to retrieve
|
||
your results. In that case, you would need to reupload your predictions and start a new evaluation.
|
||
For this reason, we recommend saving the process ID as soon as it is displayed.
|
||
</p>
|
||
</div>
|
||
</template>
|
||
</main>
|
||
</body>
|
||
|
||
</html>
|