Author: "Ang Jia" / Topic: computer science - Searchworks@Jio Institute Digital Library Search Results

Your search for "Ang Jia" returned 7 results

1. Interpretation-enabled Software Reuse Detection Based on a Multi-Level Birthmark Model

Author: Xi Xu, Ming Fan, Zheng Yan, Qinghua Zheng, Ang Jia, and Ting Liu; affiliations listed: Xi'an Jiaotong University; Network Security and Trust; Department of Communications and Networking; Aalto University
Subjects: FOS: Computer and information sciences, Source code, Computer science, Semantics (computer science), Process (computing), Software development, Software engineering, Engineering and technology, Reuse, Obfuscation (software), Software Engineering (cs.SE), Computer Science - Software Engineering, Software, Information systems, Basic block, Electrical engineering, electronic engineering, information engineering, Data mining
Abstract: Software reuse, especially partial reuse, poses legal and security threats to software development. Because source code is usually unavailable, software reuse is hard to detect with interpretation. Moreover, current approaches suffer from poor detection accuracy and efficiency and fall far short of practical demands. To tackle these problems, we propose ISRD, an interpretation-enabled software reuse detection approach based on a multi-level birthmark model that covers the function level, basic block level, and instruction level. To overcome obfuscation caused by cross-compilation, we represent function semantics with Minimum Branch Path (MBP) and perform normalization to extract the core semantics of instructions. To detect reused functions efficiently, we design a process for "intent search based on anchor recognition". It uses strict instruction matching and an identical library call invocation check to find anchor functions (anchors for short) and then traverses the neighbors of the anchors to explore potentially matched function pairs. Extensive experiments on two real-world binary datasets show that ISRD is interpretable, effective, and efficient, achieving 97.2% precision and 94.8% recall. Moreover, it is resilient to cross-compilation and outperforms state-of-the-art approaches.
Published: 2021
Full Text: View/download PDF

2. Interpretation Area-Guided Detection of Adversarial Samples

Author: Zhou Xu, JiaLi Wei, Ming Fan, Ang Jia, Lei Xue, and Xi Xu
Subjects: Adversarial system, Computer science, Feature (computer vision), Deep learning, Pattern recognition, Artificial intelligence, Computer Science::Cryptography and Security, Interpretation (model theory), Image (mathematics)
Abstract: Deep learning systems are known to be vulnerable to adversarial samples, which are crafted to change prediction results by adding small perturbations to benign samples. Defending against adversarial attacks is important in critical fields such as autonomous driving. In this paper, we propose an interpretation area-guided method for detecting adversarial samples, which improves the performance of the typical feature squeezing method by combining it with generated interpretation results. Specifically, we divide the input image into two main parts, the interpretation part and the non-interpretation part. We then squeeze only the non-interpretation part, which reduces the side effects on benign samples. We evaluate our approach on two widely used datasets, and the results demonstrate that it outperforms the original feature squeezing method.
Published: 2020
Full Text: View/download PDF

3. When representation learning meets software analysis

Author: Jingwen Liu, Ang Jia, Ming Fan, Ting Liu, and Wei Chen
Subjects: Downstream (software development), Computer science, Deep learning, Semantics, Field (computer science), Promotion (rank), Code (cryptography), Artificial intelligence, Software analysis pattern, Software engineering, Feature learning
Abstract: In recent years, deep learning has become increasingly prevalent in the field of Software Engineering (SE). In particular, representation learning, which can learn vectors from the syntax and semantics of code, greatly benefits downstream tasks such as code search and vulnerability detection. In this work, we introduce two of our applications of representation learning to software analysis: defect prediction and vulnerability detection.
Published: 2020
Full Text: View/download PDF

4. From Innovations to Prospects

Author: Ming Fan, Wenying Wei, Ang Jia, Ting Liu, Kai Ye, Di Cui, Zijiang Yang, and Xi Xu
Subjects: FOS: Computer and information sciences, Market capitalization, Cryptocurrency, Source code, Computer science, Construct (python library), Data science, Variety (cybernetics), Software Engineering (cs.SE), Computer Science - Software Engineering, Empirical research, Digital currency, Code (cryptography)
Abstract: The great influence of Bitcoin has promoted the rapid development of blockchain-based digital currencies, especially altcoins, since 2013. However, most altcoins share similar source code, raising concerns about code innovation. In this paper, we carry out an empirical study of existing altcoins to offer a thorough understanding of various aspects of altcoin innovation. First, we construct a dataset of altcoins, including source code repositories, GitHub fork relations, and market capitalizations (cap). Then, we analyze altcoin innovations from the perspective of source code similarity. The results demonstrate that more than 85% of altcoin repositories present high code similarities. Next, we propose a temporal clustering algorithm to mine the inheritance relationships among altcoins. We construct the family pedigrees of altcoins, in which altcoins exhibit evolutionary features similar to those in biology, such as a power law in family size and variety in family evolution. Finally, we investigate the correlation between code innovations and market capitalization. Although we fail to predict the price of altcoins based on their code similarities, the results show that altcoins with higher innovations reflect better market prospects.
Published: 2020
Full Text: View/download PDF

5. Revisiting the Challenges and Opportunities in Software Plagiarism Detection

Author: Ming Fan, Ang Jia, Qinghua Zheng, Zheng Yan, Ting Liu, Xi Xu, and Yin Wang; other contributors listed: Kostas Kontogiannis, Foutse Khomh, Alexander Chatzigeorgiou, Marios-Eleftherios Fokaefs, and Minghui Zhou; affiliations listed: Ministry of Education, China; Department of Communications and Networking; Aalto University
Subjects: Strategic, defence & security studies, Computer science, Process (engineering), Other engineering and technologies, ComputingMilieux_LEGALASPECTSOFCOMPUTING, Software engineering, Engineering and technology, source code similarity, binary code similarity, Field (computer science), Software, System call, software birthmark, Scalability, Electrical engineering, electronic engineering, information engineering, Key (cryptography), Plagiarism detection, software plagiarism detection, Interpretability
Abstract: Software plagiarism seriously impedes the healthy development of open source software. To counter the code obfuscation and inherent non-determinism of thread scheduling that hinder software plagiarism detection, we proposed a new dynamic birthmark called DYnamic Key Instruction Sequence (DYKIS), a framework called Thread-oblivious dynamic Birthmark (TOB) for reviving existing birthmarks, and a thread-aware dynamic birthmark called Thread-related System call Birthmark (TreSB). Although many approaches have been proposed for software plagiarism detection, they still fall short of the following highly desired requirements: the applicability to handle binaries, the capability to detect partial plagiarism, the resiliency to code obfuscation, the interpretability of detection results, and the scalability to process large-scale software. In this position paper, we discuss and outline the research opportunities and challenges in the field of software plagiarism detection in order to stimulate innovation and direct our future research efforts.
Published: 2020
Full Text: View/download PDF

6. Benchmarking NLP Toolkits for Enterprise Application

Author: Yasaman Eftekharypour, Kok Weiying, Duc Nghia Pham, and Ang Jia Pheng
Subjects: Computer science, Lemmatisation, Benchmarking, Data type, Task (computing), Tokenization (data security), Named-entity recognition, Segmentation, Applications of artificial intelligence, Artificial intelligence, Natural language processing
Abstract: Natural Language Processing (NLP) is an important technology that shapes today's AI applications. Many NLP libraries are available for researchers and developers to perform standard NLP tasks (such as segmentation, tokenization, lemmatization, POS tagging, and NER) without developing them from scratch. However, selecting the most suitable library poses challenges in areas such as data type, performance, and compatibility. In this paper, we assess five popular NLP libraries on the standard processing tasks, using datasets crawled from different online news sources in Malaysia. We analyse the results and list the differences among the libraries. The goal of this study is to give users a clear view for selecting a suitable NLP library for their text analysis task.
Published: 2019
Full Text: View/download PDF

7. MICROCONTROLLER-BASED FOR SYSTEM IDENTIFICATION TOOLS USING LEAST SQUARE METHOD FOR RC CIRCUITS

Author: S. Yaacob, Abdul Majid, Ang Jia Yi, and Mohd Nor Azuwir
Subjects: Microcontroller, Automatic control, Computer science, SIGNAL (programming language), General Engineering, Electronic engineering, System identification, Process (computing), Experimental data, Control engineering, RC circuit, Toolbox
Abstract: System identification is one of the methods for constructing a mathematical model of a plant from experimental data. It has been widely applied in automatic control, aviation, spaceflight medicine, social economics, and other fields. With the rapid growth of science and technology, system identification has been used in an increasing variety of applications. Most system identification devices are off-line, which means that identification can only be done after the data has been collected and processed on a computer. This paper shows how to perform system identification in a real-time system. The method requires a microcontroller as the medium, so the system identification method, based on the Least Square Method, is programmed into a microcontroller. The system is then tested on an RC circuit to observe the effect of the signal and the mathematical model obtained. The data is processed with the system identification toolbox using ARX and ARMAX models. The data is also collected using the microcontroller-based tool for analysis. To ensure the validity of the model, some verification methods are performed. Results show that the microcontroller-based Least Square Method is capable of working as a system identification tool.
Published: 2015
Full Text: View/download PDF
