Recognising formula entailment using long short-term memory network
Amarnath Pathak, Partha Pakray
Journal of Information Science, 2026
The article presents an approach to recognising formula entailment, which concerns finding entailment relationships between pairs of math formulae. Because current formula-similarity-detection approaches fail to account for broader relationships between pairs of math formulae, recognising formula entailment becomes paramount. To this end, a long short-term memory (LSTM) neural network using symbol-by-symbol attention is implemented for recognising formula entailment. However, owing to the unavailability of relevant training and validation corpora, the first step is to create a sufficiently large symbol-level MATHENTAIL data set in an automated fashion. Depending on the extent of similarity between the corresponding symbol embeddings, the symbol pairs in the MATHENTAIL data set are assigned ‘entailment’ or ‘neutral’ labels. An improved symbol-to-vector (isymbol2vec) method generates mathematical symbols (in LaTeX) and their embeddings using the Wikipedia corpus of scientific documents and the Continuous Bag of Words (CBOW) architecture. Finally, the LSTM network, trained and validated using the MATHENTAIL data set, predicts formula entailment for test formula pairs with a reasonable accuracy of 62.2%.
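The labelling step described in the abstract above can be illustrated with a minimal sketch. The abstract only says that labels depend on the extent of similarity between symbol embeddings; the use of cosine similarity, the threshold of 0.7, the function names, and the toy two-dimensional embeddings below are all assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def label_pair(u: List[float], v: List[float], threshold: float = 0.7) -> str:
    """Assign 'entailment' when the similarity reaches the (assumed) threshold, else 'neutral'."""
    return "entailment" if cosine_similarity(u, v) >= threshold else "neutral"


# Hypothetical embeddings; real ones would come from a CBOW model trained on LaTeX symbols.
toy_embeddings = {
    r"\alpha": [1.0, 0.0],
    r"\beta": [0.9, 0.1],
    r"\sum": [0.0, 1.0],
}

for left, right in [(r"\alpha", r"\beta"), (r"\alpha", r"\sum")]:
    print(left, right, label_pair(toy_embeddings[left], toy_embeddings[right]))
```

In this toy setting the two Greek letters point in nearly the same direction and are labelled ‘entailment’, while the orthogonal pair is labelled ‘neutral’.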
Neural machine translation for Indian languages
Amarnath Pathak, Partha Pakray
Journal of Intelligent Systems, 2019
Machine Translation bridges communication barriers and eases interaction among people with different linguistic backgrounds. Machine Translation mechanisms exploit a range of techniques and linguistic resources for translation prediction. Neural machine translation (NMT), in particular, seeks optimal translation by training a neural network on a parallel corpus containing a considerable number of parallel source and target sentences. The easy availability of parallel corpora for major Indian languages, together with the ability of NMT systems to better analyze context and produce fluent translations, makes NMT a prominent choice for the translation of Indian languages. We have trained, tested, and analyzed NMT systems for English to Tamil, English to Hindi, and English to Punjabi translation. Predicted translations have been evaluated using Bilingual Evaluation Understudy (BLEU) and by human evaluators to assess translation quality in terms of adequacy, fluency, and correspondence with human-predicted translation.
LSTM neural network based math information retrieval
Amarnath Pathak, Partha Pakray, Ranjita Das
2019 2nd International Conference on Advanced Computational and Communication Paradigms (ICACCP 2019), 2019
The work presented in this paper ascertains the role of the Long Short-Term Memory (LSTM) neural network in Math Information Retrieval (MIR). Motivated by the promising performance of LSTM on sequence-to-sequence tasks, an LSTM based Formula Entailment (LFE) module is implemented for recognizing entailment between a mathematical user query and document formulae. The LFE module is trained and validated using a symbol-level Math Formula Entailment (MENTAIL) dataset. The relevance of a document is determined by the fraction of document formulae which entail the user query. A reasonable score of 0.45 for the P_5 evaluation measure substantiates the competence of the implemented MIR system in retrieving relevant documents corresponding to a mathematical user query.
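The relevance rule and the P_5 measure mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the entailment predictions are hard-coded booleans standing in for the output of an entailment model, the document names and function names are hypothetical, and P_5 is read as the standard precision-at-5 measure (relevant documents among the top 5, divided by 5).

```python
from typing import Sequence


def document_relevance(entails_query: Sequence[bool]) -> float:
    """Fraction of a document's formulae predicted to entail the query."""
    if not entails_query:
        return 0.0
    return sum(entails_query) / len(entails_query)


def precision_at_k(ranked_is_relevant: Sequence[bool], k: int = 5) -> float:
    """Number of relevant documents among the top k, divided by k."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_is_relevant[:k]) / k


# Hypothetical per-formula predictions (True = formula entails the query).
predictions = {
    "doc_a": [True, True, False, True],
    "doc_b": [False, False],
    "doc_c": [True, False],
}

# Score each document, then rank from most to least relevant.
scores = {doc: document_relevance(flags) for doc, flags in predictions.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)

# Hypothetical relevance judgements for the top-ranked results of one query.
print(precision_at_k([True, False, True, False, False], k=5))
```

Dividing by k rather than by the number of results returned penalises a system that returns fewer than k documents, which is the usual convention for precision at k.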
Binary vector transformation of math formula for mathematical information retrieval
Amarnath Pathak, Partha Pakray, Alexander Gelbukh
Journal of Intelligent and Fuzzy Systems, 2019
Scientific documents, which consist largely of math formulae, form a primary source of scientific and technical information. However, the indexing and search processes of conventional search engines barely account for the mathematical contents of such documents. Though the recent past has witnessed a surge in the number of Mathematical Information Retrieval (MIR) systems intended to retrieve math formulae from scientific documents, the low values of their evaluation measures indicate scope for improvement. To cope with the challenges of MIR, and to further the performance of state-of-the-art systems, a novel approach called Binary Vector Transformation of Math Formula (BVTMF) is introduced. The implemented system extracts MathML formulae from the documents, preprocesses them, and renders them into fairly large binary vectors (vectors of ‘0’s and ‘1’s). The generated formula vector represents the information content of the corresponding formula. For indexing and searching text contents, the system relies on Apache Lucene. Text and math search results retrieved by independent text and math sub-systems are re-ranked to prioritize the results containing both the text and the math components of the user query. The quality of the retrieved search results and the appreciable values of the evaluation measures substantiate the competence of the proposed approach.
Extracting context of math formulae contained inside scientific documents
Amarnath Pathak, Ranjita Das, Partha Pakray, Alexander Gelbukh
Computación y Sistemas, 2019
A math formula inside a scientific document is often preceded by its textual description, which is commonly referred to as the context of the formula. Annotating the formula with its context enriches its semantics, and consequently impacts the retrieval of mathematical contents from scientific documents. With considerable confidence, a context can be assumed to be one of the Noun Phrases (NPs) of the sentence in which the formula occurs. However, the presence of several misleading NPs in the sentence necessitates extraction of the NP that describes the formula more precisely than the rest. Although a fair number of methods have been developed for precise context extraction, it is worth exploring other techniques that could improve on their performance. To this end, this paper discusses the implementation of an automated context extraction system, which follows certain heuristics in assigning weights to different candidate NPs, and tunes those weights using a development set comprising annotated formulae. The implemented system significantly outperforms nearest-noun and sentence-pattern based methods in terms of F-score.
Recognising formula entailment using long short-term memory network A Pathak, P Pakray Journal of Information Science 52 (1), 214-227, 2026
MathIRs: A One-Stop Solution to Several Mathematical Information Retrieval Needs A Pathak, P Pakray, R Das Proceedings of International Conference on Frontiers in Computing and …, 2022
Scientific text entailment and a textual-entailment-based framework for cooking domain question answering A Pathak, R Manna, P Pakray, D Das, A Gelbukh, S Bandyopadhyay Sādhanā 46 (1), 24, 2021 Citations: 11
Context guided retrieval of math formulae from scientific documents A Pathak, P Pakray, R Das Journal of Information and Optimization Sciences 40 (8), 1559-1574, 2019 Citations: 11
English–Mizo machine translation using neural and statistical approaches A Pathak, P Pakray, J Bentham Neural Computing and Applications 31 (11), 7615-7631, 2019 Citations: 73
Extracting context of math formulae contained inside scientific documents A Pathak, R Das, P Pakray, A Gelbukh Computación y Sistemas 23 (3), 803-818, 2019 Citations: 5
Binary vector transformation of math formula for mathematical information retrieval A Pathak, P Pakray, A Gelbukh Journal of Intelligent & Fuzzy Systems, 1-11, 2019 Citations: 17
LSTM neural network based math information retrieval A Pathak, P Pakray, R Das 2019 Second International Conference on Advanced Computational and …, 2019 Citations: 23
A formula embedding approach to math information retrieval A Pathak, P Pakray, A Gelbukh Computación y Sistemas 22 (3), 819-833, 2018 Citations: 23
An Improved and Intelligent Boolean Model for Scientific Text Information Retrieval A Pathak, P Pakray Communications in Computer and Information Science (CCIS), Springer 836, 465-476, 2018 Citations: 3
Mining Fuzzy Classification Rules with Exceptions: A Comparative Study A Pathak, D Goel, S Debnath Proceedings of the International Conference on Computing and Communication …, 2018
An HMM Based POS Tagger for POS Tagging of Code-Mixed Indian Social Media Text P Pakray, G Majumder, A Pathak Communications in Computer and Information Science (CCIS), Springer 836, 495-504, 2018 Citations: 9
Neural Machine Translation for Indian Languages A Pathak, P Pakray Journal of Intelligent Systems, 2018 Citations: 67
Exception discovery using ant colony optimisation S Ratnoo, A Pathak, J Ahuja, J Vashishtha International Journal of Computational Systems Engineering 4 (1), 46-57, 2018 Citations: 4
A STUDY ON MINING FUZZY CLASSIFICATION RULES WITH EXCEPTIONS S Debnath, A Pathak 2018
MathIRs: Retrieval system for scientific documents A Pathak, P Pakray, S Sarkar, D Das, A Gelbukh Computación y Sistemas 21 (2), 253-265, 2017 Citations: 21
Classification rule and exception mining using nature inspired algorithms A Pathak, J Vashistha International Journal of Computer Science and Information Technologies 6 (3 …, 2015 Citations: 13
MOST CITED SCHOLAR PUBLICATIONS
English–Mizo machine translation using neural and statistical approaches A Pathak, P Pakray, J Bentham Neural Computing and Applications 31 (11), 7615-7631, 2019 Citations: 73
Neural Machine Translation for Indian Languages A Pathak, P Pakray Journal of Intelligent Systems, 2018 Citations: 67
LSTM neural network based math information retrieval A Pathak, P Pakray, R Das 2019 Second International Conference on Advanced Computational and …, 2019 Citations: 23
A formula embedding approach to math information retrieval A Pathak, P Pakray, A Gelbukh Computación y Sistemas 22 (3), 819-833, 2018 Citations: 23
MathIRs: Retrieval system for scientific documents A Pathak, P Pakray, S Sarkar, D Das, A Gelbukh Computación y Sistemas 21 (2), 253-265, 2017 Citations: 21
Binary vector transformation of math formula for mathematical information retrieval A Pathak, P Pakray, A Gelbukh Journal of Intelligent & Fuzzy Systems, 1-11, 2019 Citations: 17
Classification rule and exception mining using nature inspired algorithms A Pathak, J Vashistha International Journal of Computer Science and Information Technologies 6 (3 …, 2015 Citations: 13
Scientific text entailment and a textual-entailment-based framework for cooking domain question answering A Pathak, R Manna, P Pakray, D Das, A Gelbukh, S Bandyopadhyay Sādhanā 46 (1), 24, 2021 Citations: 11
Context guided retrieval of math formulae from scientific documents A Pathak, P Pakray, R Das Journal of Information and Optimization Sciences 40 (8), 1559-1574, 2019 Citations: 11
An HMM Based POS Tagger for POS Tagging of Code-Mixed Indian Social Media Text P Pakray, G Majumder, A Pathak Communications in Computer and Information Science (CCIS), Springer 836, 495-504, 2018 Citations: 9
Extracting context of math formulae contained inside scientific documents A Pathak, R Das, P Pakray, A Gelbukh Computación y Sistemas 23 (3), 803-818, 2019 Citations: 5
Exception discovery using ant colony optimisation S Ratnoo, A Pathak, J Ahuja, J Vashishtha International Journal of Computational Systems Engineering 4 (1), 46-57, 2018 Citations: 4
An Improved and Intelligent Boolean Model for Scientific Text Information Retrieval A Pathak, P Pakray Communications in Computer and Information Science (CCIS), Springer 836, 465-476, 2018 Citations: 3
Recognising formula entailment using long short-term memory network A Pathak, P Pakray Journal of Information Science 52 (1), 214-227, 2026
MathIRs: A One-Stop Solution to Several Mathematical Information Retrieval Needs A Pathak, P Pakray, R Das Proceedings of International Conference on Frontiers in Computing and …, 2022
Mining Fuzzy Classification Rules with Exceptions: A Comparative Study A Pathak, D Goel, S Debnath Proceedings of the International Conference on Computing and Communication …, 2018
A STUDY ON MINING FUZZY CLASSIFICATION RULES WITH EXCEPTIONS S Debnath, A Pathak 2018