PathMiner: A Library for Mining of Path-Based Representations of Code


Kovalenko, Vladimir; Bogomolov, Egor; Bryksin, Timofey; Bacchelli, Alberto (2019). PathMiner: A Library for Mining of Path-Based Representations of Code. In: 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), Montreal, QC, Canada, 25 June 2019 - 1 July 2019, 13-17.

Abstract

One recent, significant advance in modeling source code for machine learning algorithms has been the introduction of the path-based representation, an approach that represents a snippet of code as a collection of paths from its syntax tree. Such a representation efficiently captures the structure of code, which, in turn, carries its semantics and other information. Building the path-based representation involves parsing the code and extracting the paths from its syntax tree; together, these steps amount to a substantial technical job. With no common reusable toolkit existing for this task, the burden of mining diverts researchers' focus from the essential work and hinders newcomers to the field of machine learning on code. In this paper, we present PathMiner, an open-source library for mining path-based representations of code. PathMiner is fast, flexible, well-tested, and easily extensible to support input code in any common programming language. Preprint [https://doi.org/10.5281/zenodo.2595271]; released tool [https://doi.org/10.5281/zenodo.2595257].
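
To make the two steps named in the abstract concrete (parsing the code, then extracting paths from its syntax tree), here is a minimal sketch in Python 3.8+ using only the standard `ast` module. It does not use PathMiner or reproduce its API. The function names, the choice of `Name`, `Constant`, and `arg` nodes as terminals, and the `max_length` cutoff are all assumptions made for illustration. Each result is a triple of start token, sequence of node types along the path, and end token.

```python
import ast
from itertools import combinations

# Node types treated as terminals (tokens) in this sketch; an illustrative choice.
TERMINALS = (ast.Name, ast.Constant, ast.arg)


def token_of(node):
    """Return a printable token for a terminal node."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.arg):
        return node.arg
    return repr(node.value)  # ast.Constant


def root_to_terminal_paths(root):
    """Collect, for every terminal, the list of nodes from the root down to it."""
    paths = []

    def walk(node, prefix):
        prefix = prefix + [node]
        if isinstance(node, TERMINALS):
            paths.append(prefix)
            return  # do not descend below a terminal
        for child in ast.iter_child_nodes(node):
            walk(child, prefix)

    walk(root, [])
    return paths


def path_between(a, b):
    """Join two root-to-terminal paths through their lowest common ancestor."""
    i = 0
    while i < min(len(a), len(b)) and a[i] is b[i]:
        i += 1  # i ends as the length of the shared prefix
    up = list(reversed(a[i:]))  # from the first terminal up toward the ancestor
    down = b[i:]                # from the ancestor down to the second terminal
    return tuple(type(n).__name__ for n in up + [a[i - 1]] + down)


def extract_path_contexts(source, max_length=8):
    """Parse source and return (start token, path, end token) triples.

    max_length is a hypothetical limit on the number of nodes in a path.
    """
    terminals = root_to_terminal_paths(ast.parse(source))
    contexts = []
    for a, b in combinations(terminals, 2):
        path = path_between(a, b)
        if len(path) <= max_length:
            contexts.append((token_of(a[-1]), path, token_of(b[-1])))
    return contexts


if __name__ == "__main__":
    for context in extract_path_contexts("x = a + 1"):
        print(context)
```

For the snippet `x = a + 1`, the sketch yields three path-contexts, one for each pair of terminals (`x`, `a`, and `1`). A pairwise enumeration like this grows quadratically with the number of terminals, which is why a length cutoff is included here.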

Statistics

Downloads

4 downloads since deposited on 26 Jan 2021
4 downloads in the past 12 months

Additional indexing

Item Type: Conference or Workshop Item (Paper), refereed, original work
Communities & Collections: 03 Faculty of Economics > Department of Informatics
Dewey Decimal Classification: 000 Computer science, knowledge & systems
Scopus Subject Areas: Physical Sciences > Computer Science Applications; Physical Sciences > Software
Language: English
Event End Date: 1 July 2019
Deposited On: 26 Jan 2021 10:38
Last Modified: 27 Jan 2021 21:02
Publisher: IEEE
ISBN: 978-1-7281-3412-3
OA Status: Green
Publisher DOI: https://doi.org/10.1109/MSR.2019.00013
Other Identification Number: merlin-id:20229

Download

Green Open Access

Download PDF  'PathMiner: A Library for Mining of Path-Based Representations of Code'.
Content: Accepted Version
Filetype: PDF
Size: 514kB