There is growing interest in enhancing compiler optimizations with ML models, yet interactions between compilers and ML frameworks remain challenging. Some optimizations require tightly coupled models and compiler internals, raising issues with modularity, performance and framework independence. Practical deployment and transparency for the end-user are also important concerns. We propose ML-Compiler-Bridge to enable ML model development within a traditional Python framework while making end-to-end integration with an optimizing compiler possible and efficient. We evaluate it on both research and production use cases, for training and inference, over several optimization problems, multiple compilers and their versions, and gym infrastructures.
2023
CC
RL4ReAl: Reinforcement Learning for Register Allocation
S. VenkataKeerthy, Siddharth Jain, Anilava Kundu, and 3 more authors
In Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction, 2023
We aim to automate decades of research and experience in register allocation, leveraging machine learning. We tackle this problem by embedding a multi-agent reinforcement learning algorithm within LLVM, training it with state-of-the-art techniques. We formalize the constraints that precisely define the problem for a given instruction-set architecture, while ensuring that the generated code preserves semantic correctness. We also develop a gRPC-based framework providing a modular and efficient compiler interface for training and inference. Our approach is architecture independent: we show experimental results targeting Intel x86 and ARM AArch64. Our results match or outperform the heavily tuned, production-grade register allocators of LLVM.
arXiv
VEXIR2Vec: An Architecture-Neutral Embedding Framework for Binary Similarity
S. VenkataKeerthy, Yashas Andaluri, Sayan Dey, and 2 more authors
We propose VEXIR2Vec, a code embedding framework for finding similar functions in binaries. Our representations rely on VEX IR, the intermediate representation used by binary analysis tools like Valgrind and angr. Our proposed embeddings encode both syntactic and semantic information to represent a function, and are both application and architecture independent. We also propose POV, a custom Peephole Optimization engine that normalizes the VEX IR for effective similarity analysis. We design several optimizations in POV, such as copy/constant propagation, constant folding, common subexpression elimination and load-store elimination. We evaluate our framework on two experiments – diffing and searching – involving binaries targeting different architectures, compiled using different compilers and versions, optimization sequences, and obfuscations. We show results on several standard projects and on real-world vulnerabilities. Our results show that VEXIR2Vec achieves superior precision and recall values compared to state-of-the-art works. Our framework is highly scalable and is built as a multi-threaded, parallel library using only open-source tools. VEXIR2Vec achieves about a 3.2x speedup over the closest competitor, and orders-of-magnitude speedups over other tools.
APNET
Packet Processing Algorithm Identification using Program Embeddings
S. VenkataKeerthy, Yashas Andaluri, Sayan Dey, and 3 more authors
In Proceedings of the 6th Asia-Pacific Workshop on Networking, 2023
To keep up with network speeds, many recent works propose to offload network functions to SmartNICs. The process involves identifying packet-processing algorithms in a network function program and then offloading them to appropriate accelerators available on SmartNICs. This process is often done manually for each architecture and is error-prone and laborious. In this work, we propose an automated solution to identify algorithms in network function programs. We model our approach as a Machine Learning (ML) classification problem and propose using sophisticated program embeddings for representing the network function programs. We also identify the limited availability of datasets and propose a way of extrapolating them by systematically generating equivalent programs using (existing) compiler transformations in popular compiler infrastructures. Our approach relies on modeling programs as embeddings, uses ML models trained on such extrapolated datasets, and shows superior results over recent works.
2022
ISPASS
POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning
Shalini Jain, Yashas Andaluri, S. VenkataKeerthy, and 1 more author
In International Symposium on Performance Analysis of Systems and Software, 2022
The ever-increasing memory requirements of several applications have led to demands that embedded devices might not be able to meet. Constraining the usage of memory in such cases is of paramount importance. It is important that such code size improvements do not have a negative impact on the runtime. Improving the execution time while optimizing for code size is a non-trivial but significant task. The ordering of standard optimization sequences in modern compilers is fixed and is heuristically created by compiler domain experts based on their expertise. However, this ordering is sub-optimal and does not generalize well across all cases. We present a reinforcement learning based solution to the phase ordering problem, where the ordering improves both the execution time and code size. We propose two different approaches to model the sequences: one by manual ordering, and the other based on a graph called the Oz Dependence Graph (ODG). Our approach uses minimal data as the training set and is integrated with LLVM. We show results on x86 and AArch64 architectures on the benchmarks from SPEC-CPU 2006, SPEC-CPU 2017 and MiBench. We observe that the proposed model based on ODG outperforms the current Oz sequence in terms of both size and execution time, by 6.19% and 11.99% respectively, on average on the SPEC 2017 benchmarks.
LLVM-HPC
Reinforcement Learning assisted Loop Distribution for Locality and Vectorization
Shalini Jain, S. VenkataKeerthy, Rohit Aggarwal, and 3 more authors
In 2022 IEEE/ACM Eighth Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC), 2022
We propose IR2VEC, a concise and scalable encoding infrastructure to represent programs as a distributed embedding in continuous space. This distributed embedding is obtained by combining representation learning methods with flow information to capture the syntax as well as the semantics of the input programs. As our infrastructure is based on the Intermediate Representation (IR) of the source code, the obtained embeddings are both language and machine independent. The entities of the IR are modeled as relationships, and their representations are learned to form a seed embedding vocabulary. Using this infrastructure, we propose two incremental encodings: Symbolic and Flow-Aware. Symbolic encodings are obtained from the seed embedding vocabulary, and Flow-Aware encodings are obtained by augmenting the Symbolic encodings with the flow information. We show the effectiveness of our methodology on two optimization tasks (heterogeneous device mapping and thread coarsening). Our way of representing the programs enables us to use non-sequential models, resulting in orders-of-magnitude faster training. Both encodings generated by IR2VEC outperform the existing methods in both tasks, even while using simple machine learning models. In particular, our results improve or match the state-of-the-art speedup in 11/14 benchmark suites in the device mapping task across two platforms and 53/68 benchmarks in the thread coarsening task across four different platforms. When compared to the other methods, our embeddings are more scalable, are not data-hungry, and have better Out-Of-Vocabulary (OOV) characteristics.
2019
IJESDF
Secure Gray code-based reversible data hiding scheme in radiographic images
B. Karthikeyan, S. VenkataKeerthy, and G. Hariharan
International Journal of Electronic Security and Digital Forensics, Dec 2019
Transmitting medical information through a network for the purpose of tele-diagnosis involves a greater risk of losing the confidentiality and integrity of the information being transmitted. This paper presents a scheme that ensures reversibility of the cover image and also makes it suitable for the field of telemedicine. The methodology uses cryptographic and steganographic methods. The proposed work decreases the overhead by reducing the size of the auxiliary data to be embedded, which is used to achieve the reversibility of the cover image. The proposed method also improves the security of the data and enhances the image quality. The algorithm yields a reversible data hiding (RDH) scheme based on pixel value ordering (PVO). The methodology differs from other basic schemes in that it uses Gray code instead of ordinary binary codes. It is naturally suited to medical steganography, as the carrier image can be reconstructed after extraction of the secret data and the distortion caused by embedding is very low. The method is also robust, as the one-time pad cryptographic technique is used to generate the key.
2018
P4WE, ICNP
P4LLVM: An LLVM Based P4 Compiler
Tharun Kumar Dangeti*, S. VenkataKeerthy*, and Ramakrishna Upadrasta
In P4WE workshop, International Conference on Network Protocols (ICNP), Dec 2018
We propose P4LLVM, an LLVM-based P4 compiler for achieving better optimizations to improve the runtime performance of the network. The front-end of P4LLVM converts P4-16 code to LLVM's Intermediate Representation (IR). This IR is passed through various LLVM optimizations and is translated to JSON for targeting a BMV2 switch. We show the performance improvements obtained by running LLVM optimization passes in P4LLVM when compared to P4C.
IoTSMS
INSTRUCT: A Clustering Based Identification of Valid Communications in IoT Networks
Mohd Saalim Jamal, S. VenkataKeerthy, Hideya Ochiai, and 2 more authors
In International Conference on Internet of Things: Systems, Management and Security, Dec 2018
Providing access control to IoT devices is an essential task in today's ever-growing IoT network. IoT devices are deployed in smart homes, smart buildings, social infrastructures, etc. Illegitimate users or malware should be denied access to these devices to protect the sensitive information collected by these devices and the login privileges of their operating systems. This paper proposes INSTRUCT, a mechanism for providing access control by identifying valid communication in a network consisting of IoT devices using clustering techniques. INSTRUCT uses the fact that IoT devices usually communicate repeatedly with a fixed set of hosts/servers. By capturing the network traffic and learning patterns from it, this mechanism allows the automatic generation of an access control list that can be deployed at the intermediate network switches. INSTRUCT uses two different algorithms, for TCP and UDP respectively. These algorithms are applied to two different IoT networks for evaluation. A signature-based manual analysis is used for comparison with the access control list automatically generated by the algorithms. In our experiments, INSTRUCT achieved an accuracy of 100% as compared to the signature-based analysis in identifying valid TCP communication. In the case of UDP, it is close to 95%.
2015
ICIIECS
A hybrid technique for quadrant based data hiding using Huffman coding
S. VenkataKeerthy, T. K. C. Rhishi Kishore, B. Karthikeyan, and 2 more authors
In International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Dec 2015
The paper proposes a robust steganography technique to hide data in an image. The proposed method uses Huffman coding to minimize the number of bits to be embedded and to improve the security of the information. Security is further improved by using a cryptographic substitution cipher and quadrant-based embedding of the data. The quadrant-based embedding of data bits helps distribute the bits uniformly over the entire image rather than concentrating them in a particular region. The quality of the stego image and the embedding capacity are also improved by the use of Huffman coding. The LSB embedding technique is used in the algorithm for concealing the data in the image.