IBM Cloud Pak for Data Installation

kapil rajyaguru
6 min read · Feb 2, 2022

IBM Cloud Pak for Data is a unified data and AI platform that connects the right data, at the right time, to the right people, anywhere. Built on the Red Hat OpenShift platform, it simplifies data access, automates data discovery and curation, and safeguards sensitive information by automating policy enforcement for all users in your organization. It helps you make better data-driven decisions and lay the foundation for AI with a data fabric that connects siloed data on premises or across multiple clouds without moving it, so you can discover actionable insights and apply trusted data to build, run, automate, and manage AI models.

This article provides step-by-step instructions for installing IBM Cloud Pak for Data on a Red Hat OpenShift cluster. Before we begin the installation, let’s ensure the following assumptions and prerequisites are met.

Note: IBM® Cloud Pak for Data images are accessible from the IBM Entitled Registry. In most situations, it is strongly recommended that you mirror the necessary software images from the IBM Entitled Registry to a private container registry. Because this example is a demo deployment, I have skipped mirroring the IBM Cloud Pak for Data images to a private container registry.

Assumptions

  • Installing a fresh Cloud Pak for Data control plane, foundational services, and operators
  • The Red Hat OpenShift cluster has access to a high-speed internet connection and can pull images directly from the IBM Entitled Registry.
  • This is a demo installation, so the latest version of the software will be installed automatically on the Red Hat OpenShift cluster.
  • You have knowledge of, and experience with, managing a Red Hat OpenShift cluster.

Prerequisites

  • Red Hat OpenShift cluster version 4.6 or later, with a minimum of 48 vCPUs and 192 GB of RAM
  • A bastion host with 2 vCPUs and 4 GB of RAM, running a Linux OS
  • Internet access for the bastion host and the Red Hat OpenShift cluster
  • OpenShift Container Storage (OCS) attached to the Red Hat OpenShift cluster. This link will help you determine supported storage. This demo uses OCS.
  • A user with OpenShift cluster and project administrator access
  • IBM Cloud Pak for Data entitlement key — here is the link to download the entitlement key

Step-by-Step Instructions

Step 1: Download the files from the GitHub repo using the following command. The repo contains YAML files and text files. I have observed that Medium changes the formatting of commands, so please copy and paste the commands from the text files.

git clone https://github.com/kapilrajyaguru/Cloud-Pak-for-Data-Install.git

After downloading the files, switch to the Cloud-Pak-for-Data-Install directory.

cd Cloud-Pak-for-Data-Install/

Step 2: Setting up projects (namespaces) on Red Hat OpenShift Container Platform

  • Log in to your OpenShift cluster by running the following command. Make sure the user you log in with has cluster and project administrator access.
oc login -u username -p password clusterurl:port
  • Create the appropriate projects by running the following commands.
oc new-project ibm-common-services
oc new-project cpd-operators
oc new-project cpd-instance
  • Create the operator group for the IBM Cloud Pak foundational services project. The following example uses the recommended project name (ibm-common-services):
oc apply -f OperatorGroup.yaml
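
For reference, OperatorGroup.yaml should contain something like the following operator group, which scopes the operators to the ibm-common-services project. This sketch follows IBM’s documented example; the file in the repo is authoritative.

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: ibm-common-services
spec:
  targetNamespaces:
  - ibm-common-services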

Step 3: Configuring your cluster to pull Cloud Pak for Data images

  • Run the following command to generate a JSON file called .dockerconfigjson in the current directory.
oc extract secret/pull-secret -n openshift-config
  • Encode the username and password using Base64 encoding; the username is cp. Replace entitlement-key with your IBM Cloud Pak entitlement key.
echo -n "cp:entitlement-key" | base64 -w0
  • Add an entry for the container registry to the auths section of the .dockerconfigjson file. In the following example, the registry-location entry is new and the myregistry.example.com entry already exists.
  • Replace registry-location with cp.icr.io and base64-encoded-credentials with the encoded credentials you generated in the previous step.
vi .dockerconfigjson

{
  "auths": {
    "registry-location": {
      "auth": "base64-encoded-credentials",
      "email": "not-used"
    },
    "myregistry.example.com": {
      "auth": "b3Blb=",
      "email": "not-used"
    }
  }
}
  • Apply the new configuration by running the following command. OpenShift rolls the change out to your master and worker nodes, so wait at least 20 minutes for the nodes to come back and become Ready. (A way to watch the rollout follows this step.)
oc set data secret/pull-secret -n openshift-config --from-file=.dockerconfigjson=.dockerconfigjson
sleep 1200
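
The sleep simply gives the node rollout time to finish. If you prefer to watch progress instead of waiting a fixed 20 minutes, the machine config pools report the rollout status. This is standard OpenShift behavior, not a file from the repo:

# The rollout is complete when every pool shows UPDATED=True and UPDATING=False
oc get machineconfigpool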

Step 4: Creating catalog sources that automatically pull the latest images from the IBM Entitled Registry

  • Create the IBM Operator Catalog using the following command. (A sketch of the catalog source follows this step.)
oc apply -f OperatorCatalog.yaml
sleep 150
  • Verify that the IBM Operator Catalog was successfully created.
oc get catalogsource -n openshift-marketplace
  • Verify that ibm-operator-catalog is READY. It might take several minutes before the catalog source is ready. If the command does not return READY, wait a few minutes and try to verify the status again.
oc get catalogsource -n openshift-marketplace ibm-operator-catalog -o jsonpath='{.status.connectionState.lastObservedState} {"\n"}'
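
For reference, the catalog source that OperatorCatalog.yaml creates should look roughly like this. The sketch is based on the catalog source IBM publishes; check the file in the repo for the exact contents.

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ibm-operator-catalog
  namespace: openshift-marketplace
spec:
  displayName: IBM Operator Catalog
  publisher: IBM
  sourceType: grpc
  image: icr.io/cpopen/ibm-operator-catalog:latest
  updateStrategy:
    registryPoll:
      interval: 45m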

Step 5: Installing IBM Cloud Pak foundational services

  • Create the following operator subscription for your environment. The catalog that the operator subscription points to depends on the type of catalog source that you created and the location from which the cluster pulls images. (A sketch of the subscription follows this step.)
oc apply -f OperatorSubscription.yaml
sleep 300
  • Verify the status of the ibm-common-service-operator CSV, the custom resource definitions, and the IBM Cloud Pak foundational services API resources by running the following commands.
oc --namespace ibm-common-services get csv
oc get crd | grep operandrequest
oc api-resources --api-group operator.ibm.com
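
For reference, OperatorSubscription.yaml likely resembles the following subscription to the foundational services operator, pointing at the catalog created in step 4. The channel is an assumption for this release; the file in the repo is authoritative.

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ibm-common-service-operator
  namespace: ibm-common-services
spec:
  channel: v3            # assumption: the channel for your release may differ
  installPlanApproval: Automatic
  name: ibm-common-service-operator
  source: ibm-operator-catalog
  sourceNamespace: openshift-marketplace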

Step 6: Creating the License service operator subscription

  • Submit the following operand request to install the License Service operator in the project where you plan to install the Cloud Pak for Data software. (A sketch of the operand request follows this step.)
oc apply -f LicSrvOpr.yaml
sleep 150
  • Run the following command to confirm that the operand request was created. Verify that the command returns Running. If the command returns Initialized or Installing, wait several minutes and rerun the command.
oc get pod -n ibm-common-services -l app.kubernetes.io/name=ibm-licensing -o jsonpath='{.items[0].status.phase} {"\n"}'
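
For reference, LicSrvOpr.yaml is likely an operand request along these lines. The metadata name is an assumption; per the step above, the request targets the project where Cloud Pak for Data will be installed.

apiVersion: operator.ibm.com/v1alpha1
kind: OperandRequest
metadata:
  name: ibm-licensing-request   # assumption: the name in the repo may differ
  namespace: cpd-instance
spec:
  requests:
  - operands:
    - name: ibm-licensing-operator
    registry: common-service
    registryNamespace: ibm-common-services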

Step 7: Creating an operator subscription for the scheduling service

  • Create a scheduling service operator subscription for your environment. (A sketch of the subscription follows this step.)
oc apply -f sched.yaml
sleep 150
  • Run the following command to confirm that the subscription was triggered. Verify that the command returns ibm-cpd-scheduling-operator.v1.3.2.
oc get sub -n ibm-common-services ibm-cpd-scheduling-catalog-subscription -o jsonpath='{.status.installedCSV} {"\n"}'
  • Run the following command to confirm that the cluster service version (CSV) is ready. Verify that the command returns Succeeded: install strategy completed with no errors
oc get csv -n ibm-common-services ibm-cpd-scheduling-operator.v1.3.2 -o jsonpath='{ .status.phase } : { .status.message} {"\n"}'
  • Run the following command to confirm that the operator is ready. Verify that the command returns an integer greater than or equal to 1. If the command returns 0, wait for the deployment to become available.
oc get deployments -n ibm-common-services -l olm.owner="ibm-cpd-scheduling-operator.v1.3.2" -o jsonpath="{.items[0].status.availableReplicas} {'\n'}"

Step 8: Creating an operator subscription for the IBM Cloud Pak for Data platform operator

  • Create the following operator subscription. (A sketch of the subscription follows this step.)
oc apply -f cpdoprator.yaml
sleep 150
  • Run the following command to confirm that the subscription was triggered. Verify that the command returns cpd-platform-operator.v2.0.6.
oc get sub -n ibm-common-services cpd-operator -o jsonpath='{.status.installedCSV} {"\n"}'
  • Run the following command to confirm that the cluster service version (CSV) is ready. Verify that the command returns Succeeded: install strategy completed with no errors
oc get csv -n ibm-common-services cpd-platform-operator.v2.0.6 -o jsonpath='{ .status.phase } : { .status.message} {"\n"}'
  • Run the following command to confirm that the operator is ready. Verify that the command returns an integer greater than or equal to 1. If the command returns 0, wait for the deployment to become available.
oc get deployments -n ibm-common-services -l olm.owner="cpd-platform-operator.v2.0.6" -o jsonpath=”{.items[0].status.availableReplicas} {'\n'}"

Step 9: Installing Cloud Pak for Data

  • Enable the IBM Cloud Pak for Data platform operator and the IBM Cloud Pak foundational services operator to watch the project where you will install IBM Cloud Pak for Data. (Sketches of cpdinstall.yaml and storage.yaml follow this step.)
oc apply -f cpdinstall.yaml
sleep 60
  • Run the following command to create the storage resource that installs Cloud Pak for Data. I have used OpenShift Container Storage (OCS). If you use different storage, please find the relevant code for your storage at step #3 here.
oc apply -f storage.yaml
sleep 60
  • Change to the project where you installed Cloud Pak for Data. For example:
oc project cpd-instance
  • Run the following command to determine whether the ibmcpd-cr custom resource has been created. The sleep waits 90 minutes for the installation to progress. If the output is InProgress, wait longer and then run the oc get command again.
sleep 5400
oc get Ibmcpd ibmcpd-cr -o jsonpath="{.status.controlPlaneStatus}{'\n'}"
  • Run the following command to determine whether the control plane is ready.
oc get ZenService lite-cr -o jsonpath="{.status.zenStatus}{'\n'}"
  • Get the URL of the Cloud Pak for Data web client.
oc get ZenService lite-cr -o jsonpath="{.status.url}{'\n'}"
  • Get the initial password for the admin user.
oc extract secret/admin-user-details --keys=initial_admin_password --to=-
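
For reference, here is a sketch of what cpdinstall.yaml and storage.yaml likely contain, following IBM’s express installation pattern: an empty operand request that lets the operators watch the cpd-instance project, and an Ibmcpd custom resource that accepts the license and sets the storage class. The license edition and storage class name are assumptions; the files in the repo are authoritative.

apiVersion: operator.ibm.com/v1alpha1
kind: OperandRequest
metadata:
  name: empty-request
  namespace: cpd-instance
spec:
  requests: []           # empty request: only grants the operators watch access
---
apiVersion: cpd.ibm.com/v1
kind: Ibmcpd
metadata:
  name: ibmcpd-cr        # matches the resource checked in the step above
  namespace: cpd-instance
spec:
  license:
    accept: true
    license: Enterprise                       # assumption: use the edition you are entitled to
  storageClass: ocs-storagecluster-cephfs     # assumption: OCS file storage class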

I hope this quick step-by-step guide helps you deploy IBM Cloud Pak for Data on a Red Hat OpenShift cluster running on Azure.
