Software for nuclear magnetic resonance (NMR) spectrometers offer general functionality of instrument control and data processing; these applications are often developed with non-scripting languages. NMR users need to flexibly integrate rapidly developing NMR applications with emerging technologies. Scripting systems offer open environments for NMR users to write custom programs. However, existing scripting systems have limited capabilities for both extending the functionality of NMR software’s non-script main program and using advanced native script libraries to support specialized application domains (e.g., biomacromolecules and metabolomics). Therefore, it is essential to design a novel scripting system to address both of these needs.
Here, a novel NMR scripting system named SpinSPJ is proposed. It works as a plug-in in the Java based NMR spectrometer software SpinStudioJ. In the scripting system, both Java based NMR methods and original CPython based libraries are supported. A module has been developed as a bridge to integrate the runtime environments of Java and CPython. The module works as an extension in the CPython environment and interacts with Java via the Java Native Interface. Leveraging this bridge, Java based instrument control and data processing methods of SpinStudioJ can be called with the CPython style. Compared with traditional scripting systems, SpinSPJ better supports both extending the non-script main program and implementing advanced NMR applications with a rich variety of script libraries. NMR researchers can easily call functions of instrument control and data processing as well as developing complex functionality (such as multivariate statistical analysis, deep learning, etc.) with CPython native libraries.
SpinSPJ offers a user-friendly environment to implement custom functionality leveraging its powerful basic NMR and rich CPython libraries. NMR applications with emerging technologies can be easily integrated. The scripting system is free of charge and can be downloaded by visiting http://www.spinstudioj.net/spinspj.
Since its discovery in the 1940s, nuclear magnetic resonance (NMR) has been adopted in many important fields including chemistry, biology, medicine, etc. NMR software for spectrometers is an important tool to implement applications in these fields. With the expansion of NMR applications, a variety of usage scenarios have to be supported by the software. For example, a developer expects to prototype a spectral reconstruction method ; an NMR facility manager needs to handle data management; and a user wants to perform the statistical analysis  of a completed NMR experiment. To support above usage scenarios and rapidly developing NMR applications, emerging technologies have to be quickly integrated in NMR software. For instance, deep learning  has been successfully applied in non-uniform sampling , spectrum denoising , chemical shift prediction [5,6,7,8], etc. Multivariate statistical analysis  plays an important role in metabolomics [9, 10] to reveal the relationships between metabolites and significant issues such as diseases and biological processes. However, NMR users have to wait for software vendors to integrate newly developed functionality and distribute the new versions. As an alternative solution for NMR users who expect to freely implement their own customized functionality in NMR software, scripting systems offer an open environment that allows users to write scripting programs. As a module in NMR software, the scripting system can both extend the existing functionality of NMR software’s non-script main program and perform native libraries of scripting languages. In the scripting system, existing non-script functions such as instrument control and data processing of main program can be called as a script style. As emerging technologies are expected to play an increasingly important role to solve complex problems for advanced NMR applications, it is essential to enhance scripting systems’ capabilities of implementing emerging technologies and advanced applications.
However, the existing scripting systems may be difficult to use in the development of extended functionality for main programs and to support emerging technologies such as deep learning and multivariate statistics analysis. The main program of NMR software for spectrometers is often developed with a mainstream non-scripting language; these programs are typically compiled to binary code and directly executed by computer for higher execution efficiency. Scripting languages  are dynamically interpreted to machine instructions by corresponding interpreters in real time. Script programs can be freely modified and executed without recompiling. For existing scripting systems, the first type uses standard scripting languages and supports extending the functionality of main programs, but it has a more limited selection of advanced algorithms such as fast numerical computation and deep learning. The second type is better in advanced numerical computation, but it has critical disadvantages in execution efficiency  and implementing complex graphical user interfaces. These are important in the data acquisition and real-time curve display for instrument control applications. Existing NMR software of the second type focuses on data processing; they are not intended for the instrument control of spectrometers. Therefore, it is necessary to design a new scripting system, which has the capabilities to extend the functionality of the main programs (written in non-scripting languages) and rapidly implement emerging technologies by using advanced script libraries.
In this paper, a novel NMR scripting system named SpinSPJ (SpinStudioJ' Scripting system with Python and Java) is introduced. By integrating the CPython and Java programming languages, the system offers the benefits of both languages. CPython has flexible syntax features and has been widely adopted in various fields such as artificial intelligence, scientific computation, etc. It has a rich collection of libraries and a powerful ecosystem, which is significant for developing NMR advanced applications; The Java programming language  is cross platform and robust enough to implement multithreading, complex graphical user interfaces, etc. It is the development language for the main program of SpinStudioJ , which is an NMR software for spectrometers. The use of Java language is beneficial for the scripting system to interact with the main program. Therefore, SpinSPJ can extend the functionality of SpinStudioJ’s main program such as instrument control, data acquisition and data processing implemented in Java; in addition, it can rapidly adopt emerging technologies by leveraging CPython’s rich native libraries.
The proposed scripting system SpinSPJ offers a flexible scripting environment to implement custom functionality for NMR users. SpinSPJ works as a plug-in in SpinStudioJ, which is a plug-in based NMR software. SpinSPJ interacts with other plug-ins with the mechanism defined by framework OSGi (Open Service Gateway Initiative) . The relationship between SpinSPJ and SpinStudioJ is illustrated in Fig. 1.
The conventional instrument control and data processing capabilities are implemented in Java. CPython has advantages in the availability of advanced libraries for numerical computation and artificial intelligence (e.g., NumPy , TensorFlow ). The significant issue of the proposed scripting system focuses on how to build a bridge that connects Java and CPython so that both Java-based NMR methods and third-party CPython libraries are supported. CPython, developed with the C programming language, supports an extension mechanism to wrap C libraries as customized modules. The Java virtual machine provides a mechanism called the Java Native Interface (JNI)  to support interactions with the C programming language. Through the JNI, Java can call functions defined by C, and C also can access resources (e.g., classes, functions, objects) in the Java environment. Therefore, the C programming language can serve as an ideal bridge between the Java and CPython languages.
The overall architecture of SpinSPJ is illustrated in Fig. 2. According to the computer languages adopted, the entire scripting system consists of three components: Java, C, and CPython.
The Java component is responsible for the graphical user interface and provides the interfaces for CPython including the basic configuration, scripting editor, instrument control, and data processing. The basic configuration can set the location of CPython libraries. The scripting editor offers a script editing window, execution output, menus, and toolbar. The interfaces for instrument control and data processing are implemented by the OSGi which separates the abstract interface from the concrete business logic. The instrument control includes sample control, temperature control, tuning, locking, shimming, data acquisition, etc. The data processing includes Fourier transform, phase correction, baseline correction, peak picking, and integration functionality. The NMR data of a HDF5 [29, 34] based custom format organises parameters, pulse sequence, free induction decay (FID), spectrum, peak list as a hierarchical style in a single file. It can be easily read and converted to other data formats by third party analysis tools (e.g., Matlab, CPython). After processing operations selected by NMR users are performed by CPython scripts, the processed data can be written back and saved to the disk. The scripts can be set to both blocking and non-blocking modes to ensure the statements are executed in the expected sequence.
The C component is the bridge between the Java and CPython components. Through the JNI, Java can call functions in C based libraries. If necessary, the C component also can create Java objects and call Java functions. Based on the C extension mechanism of CPython, the C component can define customized modules for the CPython component. In the CPython environment, customized modules can be compiled and linked with basic CPython libraries. Two significant methods used to accomplish these are the SetPythonHome method (set the location of the CPython libraries) and the PyRun_String (execute scripted code). A Java object can be wrapped as a PyObject, which is the foundation of the CPython language. As Java based resources can be accessed by CPython, the environments of Java and CPython are connected by the C component. In SpinSPJ, the C component creates a C extension module of CPython and builds the Java-CPython bridge with the help of an open source library called Jep .
The CPython component can define customized initialization and import methods for packages and modules, as well as offering various native libraries (e.g., NumPy, SciPy, TensorFlow). The initialization and import are the significant steps which enable the interactions between CPython and Java. For native libraries, NumPy and SciPy are usually used for fast numerical operations and scientific calculations. TensorFlow is widely used for deep learning. Non-uniform sampling and chemical shift prediction methods developed by deep learning can be easily integrated into the scripting system.
The workflow of SpinSPJ explains how the internal components work from the perspective of a time series. It contains the sequential actions and interactions of the components Java, C, and CPython during different stages. The workflow consists of three main stages: initialization, execution, and exiting.
The initialization stage is mainly for the preparation of the scripting environment. In this stage, the scripting system first configures the location of the CPython libraries. Secondly, CPython installs an importer hook and inserts it into the sys.meta_path. The importer hook defines the methods find_module and load_module to tell CPython how to find and load Java packages. Therefore, the resources of Java and CPython have been connected and the scripting environment has been established in this stage.
In the execution stage, the scripting system controls the three components to support grammar features and resource accessibility. The workflow of the stage is illustrated in Fig. 3. There are two significant issues in this stage: import and interpretation. Different from conventional CPython environments, import statements in SpinSPJ can be used to import Java packages. When an import statement is called, the scripting system searches for the expected package from the variable sys.modules. If the package is found, it indicates that the package has been loaded by CPython; otherwise, the CPython environment finds the importer hook to invoke the find_module and load_module methods to load the spinspj module. The spinspj module is used to interact with Java resources. The spinspj module has a method __getattr__ to define its submodules for packages and classes in Java environment. The methods (wrapped to PyJMethod) and fields (wrapped to PyJField) of Java objects are wrapped as the attributes of a CPython object. CPython allows the PyJMethod to implement custom execution by defining the attribute tp_call of PyTypeObject, and allows PyJField to implement custom getting and setting styles by defining the attributes tp_getattro and tp_setattro of the PyTypeObject. An example is illustrated in Fig. 3. When the NMR command go is executed in scripts, PyJMethod invokes corresponding Java method by the JNI. Therefore, the CPython interpreter can recognize Java objects as conventional native CPython objects, as well as calling Java methods freely.
In the exiting stage, exception and memory management are the significant issues to ensure the scripting system is stable and robust. The JNI allows the C component to throw C based exceptions to the Java component. The Java component catches exceptions and back traces. For memory management, the Java component can reclaim memory at runtime by automatically leveraging the garbage collection feature of the Java virtual machine, so there is no need to release memory manually. However, the C component must release memory manually. Both the JNI and the C extension mechanism offer corresponding methods to release memory in order to avoid memory leaks.
The proposed SpinSPJ scripting system provides a familiar graphical user interface layout for the user, which makes it straightforward to use. A screenshot of the NMR scripting editor is illustrated in Fig. 4. The menus and tools offer not only conventional functionality for file access and text editing, but also example scripts and help manuals. All the function and parameter definitions are described in the help manuals. In addition, commands for starting and stopping to run the scripts are available both in menus and toolbar. For script editing area, the key words of CPython can be marked as a highlighted style. Code comments and strings can be respectively displayed as green and blue. When NMR users enter the code spinspj., the available fields and methods of the spinspj module are displayed as prompts. When there is only one available option, the statement is automatically completed. The bottom region of the interface can show the outputs during the execution of scripts. The reported errors, warnings, and exceptions can prompt users to deal with problems during the execution of script programs. The script file (e.g., file name is “xxxx.py”) can be executed by entering the file name (e.g., “xxxx”) of the script in the command line of NMR software SpinStudioJ.
The scripts offer general functionality such as instrument control, data processing as well as native CPython libraries. Typical scripting functions are described as Table 1.
Instrument control is used to control the NMR spectrometer’s physical components such as auto sample changer, temperature control, shimming, and locking units. For data acquisition, scripts are allowed to set the parameters of the workspace, and then start the acquisition command. Both the blocking and non-blocking modes are supported. In the blocking mode, the script waits for the completion of the invoked method until the maximum time is exhausted. In the non-blocking mode, the script invokes the method and doesn’t wait for the completion of its execution.
For data processing, scripts can call conventional processing methods such as linear prediction, Fourier transform, phase correction, baseline correction, etc. Scripts can access an FID or spectrum from a workspace, as well as updating the spectrum display after a transformation or analysis.
In order to achieve powerful performance, most native libraries of CPython can be used in the scripting system, including NumPy, SciPy, matplotlib, and TensorFlow. NumPy supports a user-friendly and efficient numerical manipulation. Users can read data (FID or spectrum) from a workspace and convert the data format to an ndarray or matrix; these are the basic data formats for fast numerical calculations in Numpy. SciPy is a scientific computation library which can be used in parameter optimization and data denoising. Matplotlib library is a visualization library for FID or spectrum plotting. TensorFlow can be used to implement deep learning which is an emerging field in NMR.
To illustrate the performance of the scripting system, three examples of NMR methods are presented. The first script example for automatic searching for shimming is illustrated in Fig. 5. Automatic searching for shimming aims to calibrate output values of instrumental shim power supply by using a multivariate optimization algorithm, which can significantly improve the homogeneity of a center magnetic field. It needs real-time data acquisition and optimization analysis with an alternating style. The proposed SpinSPJ scripting system offers Java based instrument control for setting shim values and real-time data acquisition. The SciPy library provides the optimization algorithm Simplex  to generate new shim values with an optimized search path. As illustrated in Fig. 5, the experiment of searching for shimming has been performed on a Zhongke-Niujin 500 MHz QOnePlus NMR spectrometer. The test sample is 0.1 mg/ml GdCl3 in D2O with 1% H2O. The evaluation criterion of field homogeneity is the area of FID, which can be affected by adjusting the shim values. In each iteration, Simplex can simultaneously change all of the Z1-Z4 shim values and seek their best combination by evaluating the area of the FID. After 40 iterations, the area of the FID is optimized from 350,781 to 864,282 (ADC value), and the half peak width of H2O peak is optimized from 15.8 to 3.7 Hz, which presents a substantial improvement on the magnetic field homogeneity. This example demonstrates that the scripting system can be used to flexibly develop user defined NMR methods for the instrument calibration by combining data acquisition and CPython based optimization algorithms.
The second example is for a principal component analysis (PCA)  of metabolomic data. A PCA is the significant step to distinguish among the bulk data by projecting them onto multiple orthogonal components. As illustrated in Fig. 6, the data are from the web site metabolomics workbench , and the corresponding study ID is “ST000101” which presents an NMR analysis of synthetic mixtures. Each of a total of 10 samples has 20 synthetic metabolites. All of the samples can be divided into two groups because the quantities of the 10 metabolites are quite different. After a Java based automatic phase, baseline correction, and integration of binned spectrals, the data set is analyzed by a PCA that is implemented with the proposed SpinSPJ scripting system. The script for the PCA includes reading the original data, calculating the average and standard deviation, conducting a PCA, and displaying the columnar and scattered data. The CPython libraries of NumPy, scikit-learn, and matplotlib are helpful to implement above requirements. The histogram gives the result of the 1st–8th principal components and their contribution percentages. Among all of the components, the first principal component can explain 58.02% of the information in the total samples. The score scatter plot shows the score values of each sample on the first and second principal component, where “·” denotes the published result of the study and “×” denotes the calculated score values by the scripts. Obviously, on the first dimension of the PCA, the samples are divided into group A and B, which is consistent with the metabolite composition of the samples. The data of the score scatter plot are consistent with the published results of the study . This example indicates that the scripting system can support complex multivariate statistical algorithms and chart displays with the help of native CPython libraries.
The third example focuses on deep learning NMR (DLNMR)  for non-uniform sampling (NUS)[41, 42]. NUS is an emerging NMR field that can accelerate multi-dimensional experiments of biomacromolecule as well as reducing the heating effect due to RF excitation. Deep learning based NUS methods have achieved accurate and fast reconstruction. The script reconstructs the spectra with a smart dense convolution neural network (DCNN) , which has been trained with simulated or acquired data. A DCNN requires a variety of libraries to implement the convolution and data fitting capabilities. SpinSPJ can integrate all of the related libraries (such as TensorFlow, keras, cuDNN) in the CPython environment. As illustrated in Fig. 7, a 3D HNCO NMR data of Azurin (molecular weight is 14 kDa) with full sampling was downloaded from the MddNMR website http://mddnmr.spektrino.com. A fair comparison between full and 20% sampling with respect to the spectral quality are shown in (a)–(f). In Fig. 7, (a) and (c) are the sub-regions of the projections on the planes of 15N–1H and 15N–13C for the fully sampled 3D spectrum, which is reconstructed by fast Fourier Transform. (b) and (d) are the corresponding reconstruction results of a 20% sampling rate, which is reconstructed by a DLNMR method. (e) gives the correlation coefficient of the peak intensities of (a) and (b); (f) gives the correlation coefficient of the peak intensities of (c) and (d). The correlation coefficients of peak intensities between DLNMR reconstructed and fully sampled spectrums are greater than 0.99, which indicates excellent fidelity of the two spectrums. For the 3D NMR, the achieved acceleration factor of 5 in NUS implies that the experimental time can be reduced from 22.4 to 4.48 h. In addition, the computation time for the 3D reconstruction is 9.66 s (data size: 732*60*60, GPU: NVIDIA Tesla K40m), which demonstrates the method achieves very high computational efficiency. This example shows the capability of the scripting system to implement a deep learning based NMR method by leveraging native CPython libraries.
SpinSPJ is a novel NMR scripting system which offers general functionality for instrument control and data processing; in addition, it can leverage native CPython libraries. Conventional instrument control and data processing are implemented by Java programming language; CPython libraries are helpful for providing advanced algorithms such as fast numerical computation, artificial intelligence etc. More advanced NMR functionality such as chemical shift and protein structure prediction are going to be integrated in the future.
SpinSPJ can be downloaded free of charge by visiting the website: http://www.spinstudioj.net/spinspj. The source code is private and owned by Zhongke-Niujin MR Tech Co.Ltd. The released software products are freely available to any researcher wishing to use them for non-commercial purposes, and licenses are needed for commercial purposes. The coded scripts are available in the GitHub repository https://github.com/qonenmr/spinspj.
Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin CL, Zídek A, Nelson AWR, Bridgland A, Penedones H, Petersen S, Simonyan K, Crossan S, Kohli P, Jones DT, Silver D, Kavukcuoglu K, Hassabis D. Improved protein structure prediction using potentials from deep learning. Nature. 2020;577:706–10.
Badenhorst M, Barry CJ, Swanepoel CJ, van Staden CT, Wissing J, Rohwer JM. Workflow for data analysis in experimental and computational systems biology: using python as ‘glue.’ Processes. 2019;7(7):460–76.
Hyberts SG, Milbradt AG, Wagner AB, Arthanari H, Wagner G. Application of iterative soft thresholding for fast reconstruction of NMR data non-uniformly sampled with multidimensional Poisson Gap scheduling. J Biomol NMR. 2012;52:315–27.
We gratefully acknowledge Mr. Rui Chen, Mr. Rui Cao, Mr. Xuedong Zheng, and Ms. Qingyuan Li for their patient programming assistance. We sincerely appreciate the fruitful discussions with Prof. Xiaobo Qu, Prof. Xianzhong Yan, and Mr. Shigan Chai. We also thank EditSprings (https://www.editsprings.com) for the expert linguistic services.
No funding was obtained for this study.
Authors and Affiliations
State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Center for Magnetic Resonance, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan, 430071, People’s Republic of China
Zhongke-Niujin MR Tech Co. Ltd, Wuhan, 430075, People’s Republic of China
Zao Liu & Kan Song
Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance Research, Xiamen University, Xiamen, 361005, People’s Republic of China
ZL completed the programming task and wrote the manuscript. ZWC designed the main framework of the scripting system. KS designed the script examples presented in this article. All authors read and approved the final manuscript.
. Example script files of SpinSPJ, including automation, searching shimming, baseline correction, PCA, NUS, etc.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Liu, Z., Chen, Z. & Song, K. SpinSPJ: a novel NMR scripting system to implement artificial intelligence and advanced applications.
BMC Bioinformatics22, 581 (2021). https://doi.org/10.1186/s12859-021-04492-y