Patrick Titzler for IBM Developer

Posted on Apr 1, 2021

Elyra 2.2: R support, updated CLI, and more

#jupyter #datascience #kubeflow #apacheairflow

It's only been a couple of weeks since the Elyra open source community published version 2.1, which introduced experimental support for Apache Airflow.

Version 2.2 delivers several exciting enhancements that our growing user base had on their wishlist. In this blog post I'll summarize the highlights:

New R script editor
Improved pipeline editor with support for R scripts
Deployment on Kubeflow Notebook servers
Extended command line interface with support for pipeline execution

As always, you can find the complete list of features and bug fixes that made it into the release in the changelog.

Edit and run R scripts

The Elyra Script Editor was extended to support the R language. You can now create, edit, and run R scripts in JupyterLab, in addition to Python scripts.

By installing the optional Language Server Protocol support for R, you can take advantage of productivity features you are likely familiar with from other IDEs, such as code linting and auto-completion.

We realize that this editor can't compete with RStudio, but it's a start!

If you've used Jupyter notebooks in an enterprise deployment, you are probably familiar with the Jupyter Enterprise Gateway (JEG). In fact, even if you are not familiar with it, you might have used it. In a nutshell, it gives you the ability to run notebooks in remote kernels, allowing for better resource allocation and usage.

Watson Studio is one example of a managed enterprise service that leverages the JEG.

Because it extends JupyterLab functionality, Elyra can take advantage of the Enterprise Gateway as well. One little-known feature of Elyra, though, is its ability to also allow for remote execution without the need for JEG, through its support for Kubeflow Pipelines or Apache Airflow.

If you have access to a Kubeflow Pipelines or Apache Airflow deployment, you can run R scripts (just like Python scripts and Jupyter notebooks) in those deployments directly from the editor. This is especially useful for scripts that require resources that are not available (or not sufficiently available) in your local environment.

Run R scripts in pipelines

In the Visual Pipeline Editor you can now assemble pipelines from multiple R scripts, or mix R scripts with Jupyter notebooks and Python scripts, as necessary.

You can run these pipelines locally in JupyterLab or remotely on Kubeflow Pipelines or Apache Airflow.

If you are new to Elyra pipelines, take a look at the tutorials. They guide you through the process of creating and running a pipeline in various environments.

Use Elyra to run Kubeflow-hosted notebooks

Elyra can be deployed locally or in remote environments.

A local deployment typically serves only a single user and is created by installing Elyra from PyPI, conda, source code, or pulling a ready-to-use container image.

Remote deployments, such as in a data center or the cloud, are typically used when support for many users is required.

A common approach is to deploy JupyterHub on Kubernetes and configure it for Elyra, as is done in Open Data Hub on the Red Hat OpenShift Container Platform.

If you already have Kubeflow deployed and don't want to provision a dedicated instance of JupyterHub to serve notebooks, we've got great news for you. We've recently started to publish custom Elyra container images on Docker Hub and quay.io that you can use to run JupyterLab with Elyra on Kubeflow Notebook Servers. All you need to do is specify the Elyra container image name and (version) tag when you configure a new notebook server and you are good to go!

Extended command line interface

As an extension to JupyterLab, Elyra is primarily GUI driven. However, there are certain tasks that can also be completed using the elyra-metadata command line interface.

The Elyra command line interface was extended in version 2.2 to support running pipelines in local and remote environments. Initially this capability is only exposed through the elyra-pipeline command line interface, but work is underway to provide a unified interface.

Run pipelines locally

Specify the run command to execute the pipeline locally, passing the pipeline file name as a parameter, like so:

$ elyra-pipeline run /path/to/hello-world.pipeline

This feature is still under active development, for example to add visualization of execution progress.
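
If you call the run command from your own scripts, it can help to check the pipeline file first. The following is a minimal sketch rather than anything Elyra ships: the function name run_pipeline_locally and the DRY_RUN variable are assumptions made up for this illustration, and only the elyra-pipeline run invocation itself is taken from the example above.

```shell
# Sketch: validate a pipeline file before handing it to `elyra-pipeline run`.
# `run_pipeline_locally` and DRY_RUN are hypothetical helpers, not part of Elyra.

run_pipeline_locally() {
  local pipeline_file="$1"

  # Fail early with a clear message if the file does not exist.
  if [ ! -f "$pipeline_file" ]; then
    echo "error: pipeline file not found: $pipeline_file" >&2
    return 1
  fi

  # The example in this post uses the .pipeline extension; reject anything else.
  case "$pipeline_file" in
    *.pipeline) ;;
    *)
      echo "error: expected a .pipeline file: $pipeline_file" >&2
      return 1
      ;;
  esac

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of executing it.
    echo "elyra-pipeline run $pipeline_file"
  else
    elyra-pipeline run "$pipeline_file"
  fi
}
```

With Elyra installed, run_pipeline_locally /path/to/hello-world.pipeline behaves like the plain command, while setting DRY_RUN=1 only prints what would be executed.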

Run pipelines remotely

Specify the submit command to run the pipeline on Kubeflow Pipelines or Apache Airflow, passing the pipeline file name and the runtime configuration name as parameters, like so:

$ elyra-metadata list runtimes
 Schema    Instance       Resource
 ------    --------       --------
 kfp       kfp_test_env   /.../runtimes/kfp_test_env.json

$ elyra-pipeline submit --runtime-config kfp_test_env /path/to/hello-world.pipeline
  ...

If the pipeline was successfully submitted for execution, the command returns a GUI link that you can use to monitor the progress and a link to the cloud storage where the pipeline run artifacts are stored.
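
The two commands above can be combined so that a pipeline is only submitted if the named runtime configuration exists. This is a rough sketch under assumptions: runtime_exists and submit_pipeline are hypothetical helper names, and the parsing assumes the list output keeps the column layout shown above, with the configuration name in the second (Instance) column.

```shell
# Sketch: check that a runtime configuration exists before submitting.
# `runtime_exists` and `submit_pipeline` are hypothetical helpers, not Elyra commands.

# Reads `elyra-metadata list runtimes` output on stdin and succeeds only if
# the given name appears in the second ("Instance") column.
runtime_exists() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

submit_pipeline() {
  local runtime_config="$1"
  local pipeline_file="$2"

  # Assumption: the list output matches the table layout shown in this post.
  if ! elyra-metadata list runtimes | runtime_exists "$runtime_config"; then
    echo "error: unknown runtime configuration: $runtime_config" >&2
    return 1
  fi

  elyra-pipeline submit --runtime-config "$runtime_config" "$pipeline_file"
}
```

For example, submit_pipeline kfp_test_env /path/to/hello-world.pipeline would stop with an error message if kfp_test_env is not listed, instead of attempting the submission.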

Improved usability

If you've used previous releases of Elyra, you should notice quite a few usability improvements that we've made. There's no denying that the Elyra project has matured a lot since it was started about a year ago.

Coming up next

We've just started work on our next releases. There's plenty of stuff brewing in our lab. If you'd like to get the inside scoop, check out our discussion forum, chat with us, or join the weekly community meeting.
