The CommentsMiner is the core object of SoCCMiner that gets populated in all the pipelines. It has the following options:
source_url: Refers to the GitHub repository URL or the local directory to mine. When loading, it always refers to the local directory containing the mined JSON data.
lang: Refers to the programming language of the source code. The default value is currently ‘java’.
m_level: Refers to the mining level, which can be one of ‘comment’, ‘comprehensive_comment’, ‘project’, or ‘all’. The default value is ‘comment’.
direct_load: Set to True to load already serialized data, otherwise False. The default value is False.
log: Used for debugging. It can be ‘DEBUG’, ‘INFO’, or ‘NOLOG’. By default, no log is generated; set it to ‘INFO’ for brief trace information or to ‘DEBUG’ for an elaborate execution trace. The log file is generated in the current working directory.
output_dir: Sets the local directory where the mined data is serialized. By default, the serialized attributes are written to the directory ‘SoCCMiner_Mined_Entities’ under the current working directory.
mode: Refers to the mode in which SoCCMiner mines the source code. It can be ‘single’, which indicates that source_url contains only one project directory to mine, or ‘multiple’, which indicates that source_url contains multiple project repositories as subdirectories. If source_url contains multiple project repositories as subdirectories and this option is not set to ‘multiple’, SoCCMiner treats all of them as a single project.
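The options above can be collected and checked before the miner is constructed. The following is a minimal sketch and not part of SoCCMiner: `MinerOptions` and `as_kwargs` are hypothetical names, the defaults shown for `log` (‘NOLOG’) and `mode` (‘single’) are inferred from the descriptions above rather than confirmed, and the commented-out call assumes that CommentsMiner accepts these option names as keyword arguments.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Allowed values, taken from the option descriptions above.
M_LEVELS = {"comment", "comprehensive_comment", "project", "all"}
LOG_LEVELS = {"DEBUG", "INFO", "NOLOG"}
MODES = {"single", "multiple"}


@dataclass
class MinerOptions:
    """Hypothetical container for the documented options (not a SoCCMiner class)."""

    source_url: str
    lang: str = "java"
    m_level: str = "comment"
    direct_load: bool = False
    log: str = "NOLOG"                # assumed default: no log file is generated
    output_dir: Optional[str] = None  # None: keep the library's default directory
    mode: str = "single"              # assumed default

    def __post_init__(self):
        # Reject values that the option descriptions do not list.
        if self.m_level not in M_LEVELS:
            raise ValueError("unknown m_level: {!r}".format(self.m_level))
        if self.log not in LOG_LEVELS:
            raise ValueError("unknown log level: {!r}".format(self.log))
        if self.mode not in MODES:
            raise ValueError("unknown mode: {!r}".format(self.mode))

    def as_kwargs(self):
        """Return the options as a dict, omitting output_dir when it is unset."""
        kwargs = asdict(self)
        if kwargs["output_dir"] is None:
            del kwargs["output_dir"]
        return kwargs


opts = MinerOptions(source_url="/path/to/projects", m_level="all", mode="multiple")
print(opts.as_kwargs())
# Assumed usage, with SoCCMiner installed and CommentsMiner imported:
# cm = CommentsMiner(**opts.as_kwargs())
```

Validating the values up front surfaces a mistyped mining level or mode before a potentially long mining run starts.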
Granular Comments: The granular comments, each containing the three basic comment attributes (the comment content, the line number at which the comment is located in the source file, and the source file in which the comment is located), can be fetched as:
    # cm is assumed to be a CommentsMiner instance mined at the 'comment' level
    for proj in cm.fetch_mined_comments():  # mined_proj_obj_list
        print("Entire project Comments count: {}".format(len(proj.get_comments())))  # for project level comments
        print("File Level Comments count: {}".format(len(proj.get_file_level_comments())))  # for file level comments
        print("Class Level Comments count: {}".format(len(proj.get_class_level_comments())))  # for class level comments
        print("Enum Level Comments count: {}".format(len(proj.get_enum_level_comments())))  # for enum level comments
        print("Method Level Comments count: {}".format(len(proj.get_method_level_comments())))  # for method level comments
        print("Interface Level Comments count: {}".format(len(proj.get_interface_level_comments())))  # for interface level comments
        print("Static Block Level Comments count: {}".format(len(proj.get_static_block_level_comments())))  # for static block level comments
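The per-level counts can also be gathered into a dictionary for further processing. This is a minimal sketch: the accessor names are the ones used in the listing above, while `LEVEL_ACCESSORS`, `comment_counts`, and the `StubProject` stand-in (used so that the snippet runs without SoCCMiner) are hypothetical.

```python
# Maps a readable level name to the accessor method used in the listing above.
LEVEL_ACCESSORS = {
    "project": "get_comments",
    "file": "get_file_level_comments",
    "class": "get_class_level_comments",
    "enum": "get_enum_level_comments",
    "method": "get_method_level_comments",
    "interface": "get_interface_level_comments",
    "static_block": "get_static_block_level_comments",
}


def comment_counts(proj):
    """Return {level: number of comments} for one mined project object."""
    return {
        level: len(getattr(proj, accessor)())
        for level, accessor in LEVEL_ACCESSORS.items()
    }


class StubProject:
    """Hypothetical stand-in for a mined project; each accessor returns a list."""

    def __init__(self, comments_by_accessor):
        self._comments = comments_by_accessor

    def __getattr__(self, name):
        # Called only for attributes not found normally; serve known accessors.
        if name in LEVEL_ACCESSORS.values():
            return lambda: self._comments.get(name, [])
        raise AttributeError(name)


stub = StubProject({
    "get_comments": ["a", "b", "c"],
    "get_method_level_comments": ["a", "b"],
})
print(comment_counts(stub))
```

With a real mining run, `comment_counts(proj)` would be called for each `proj` returned by `cm.fetch_mined_comments()`.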
Basic Comment Attributes: The three basic comment attributes of the CommentsMetaAttribute pipeline can be fetched as:
    for proj in cm.fetch_mined_comments():  # mined_proj_obj_list
        # fetch all comments basic info
        # get_comments() fetches all comments, i.e., for all the entities
        for basic_comment_attr_obj in proj.get_comments():
            print("Comment content: {}".format(basic_comment_attr_obj.comment_text))
            print("Comment line #: {}".format(basic_comment_attr_obj.comment_line_no))
            print("Comment source file: {}".format(basic_comment_attr_obj.file_name))
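Instead of being printed, the three attributes can be collected into rows, for example to export them as CSV. This is a minimal sketch, assuming each comment object exposes `comment_text`, `comment_line_no`, and `file_name` as in the listing above; `comments_to_rows` and `rows_to_csv` are hypothetical helpers, and `SimpleNamespace` objects stand in for mined data so that the snippet runs without SoCCMiner.

```python
import csv
import io
from types import SimpleNamespace

# Attribute names as used in the listing above.
FIELDS = ("file_name", "comment_line_no", "comment_text")


def comments_to_rows(projects):
    """Flatten the comments of all mined projects into a list of dicts."""
    rows = []
    for proj in projects:
        for comment in proj.get_comments():
            rows.append({field: getattr(comment, field) for field in FIELDS})
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Stand-in data; a real run would pass cm.fetch_mined_comments() instead.
comment = SimpleNamespace(
    comment_text="// TODO: refactor",
    comment_line_no=12,
    file_name="Foo.java",
)
project = SimpleNamespace(get_comments=lambda: [comment])

rows = comments_to_rows([project])
print(rows_to_csv(rows))
```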
CommentsMiner
CommentsMetaAttribute Pipeline
fetch_mined_comments(): This method fetches the mined projects at the basic “comment” level.
Basic project attributes: The basic project attributes of the mined projects can be retrieved as follows:
ComprehensiveCommentsAttribute Pipeline
fetch_mined_comment_attributes(): This method fetches the mined projects at the “comprehensive_comment” level.
Basic project attributes: The basic project attributes of the mined projects can be retrieved as follows:
Granular Comments: The granular comments containing the comprehensive comment attributes can be fetched as:
Comprehensive Comment Attributes: The seventeen comprehensive comment attributes of the ComprehensiveCommentsAttribute pipeline can be fetched as:
JavaMetaAttribute Pipeline
fetch_mined_project_meta(): This method fetches the project metadata mined at the “project” level.
Basic project attributes: The basic project attributes of the mined projects can be retrieved as follows:
Project Meta Attributes: The thirty project meta attributes of the JavaMetaAttribute pipeline can be fetched as:
JavaMiner Pipeline
fetch_mined_project_meta_and_comments(): This method fetches all the attributes mined at the “all” mining level.
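Each of the four mining levels therefore has one matching fetch method. A minimal sketch of selecting the method from the `m_level` value follows; the method names are the ones documented in this section, while `FETCH_METHOD_BY_LEVEL`, `fetch_for_level`, and the `StubMiner` stand-in are hypothetical and not part of SoCCMiner.

```python
# Mining level -> fetch method name, as documented for the four pipelines.
FETCH_METHOD_BY_LEVEL = {
    "comment": "fetch_mined_comments",
    "comprehensive_comment": "fetch_mined_comment_attributes",
    "project": "fetch_mined_project_meta",
    "all": "fetch_mined_project_meta_and_comments",
}


def fetch_for_level(miner, m_level):
    """Call the fetch method that corresponds to the given mining level."""
    try:
        method_name = FETCH_METHOD_BY_LEVEL[m_level]
    except KeyError:
        raise ValueError("unknown mining level: {!r}".format(m_level)) from None
    return getattr(miner, method_name)()


class StubMiner:
    """Hypothetical stand-in for a CommentsMiner mined at the 'all' level."""

    def fetch_mined_project_meta_and_comments(self):
        return ["project-1", "project-2"]


print(fetch_for_level(StubMiner(), "all"))
```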
Basic project attributes: The basic project attributes of the mined projects can be retrieved as follows:
Granular Comments: All the attributes discussed in the previous mining levels are available in the “all” level. For example,
Project MetaAttributes: All the project attributes of the JavaMetaAttribute pipeline can be retrieved, for example: