Cloud Pak for Data 4.5 and Watson Knowledge Catalog Installation Guide

kapil rajyaguru
Aug 18, 2022

IBM Cloud Pak for Data is a unified data and AI platform that connects the right data, at the right time, to the right people anywhere. Running on the Red Hat OpenShift platform, it simplifies data access, automates data discovery and curation, and safeguards sensitive information by automating policy enforcement for all users in your organization. It helps you make better data-driven decisions and lay the foundation for AI with a data fabric that connects siloed data on-premises or across multiple clouds without data movement, so you can discover actionable insights and apply trusted data to build, run, automate, and manage AI models.

This article provides step-by-step instructions to install IBM Cloud Pak for Data (CPD) 4.5.0 with Watson Knowledge Catalog (WKC) on a Red Hat OpenShift cluster. I used the express install method, which pulls the CPD and WKC container images directly from the IBM registry. Because the installation methods and installation types are updated frequently, I recommend checking the IBM installation documentation for the latest changes. I have provided the IBM installation documentation link at the end of this blog under Other Useful Resources.

Note: IBM® Cloud Pak for Data images are accessible from the IBM Entitled Registry. In most situations, it is strongly recommended that you mirror the necessary software images from the IBM Entitled Registry to a private container registry. Because this example deploys for demo purposes, I have skipped mirroring the IBM Cloud Pak for Data images to a private container registry.

Assumptions

  • This is a fresh installation of Cloud Pak for Data 4.5.0 with Watson Knowledge Catalog.
  • The Red Hat OpenShift cluster has a high-speed internet connection and can pull images directly from the IBM Entitled Registry.
  • The installation is for demo purposes, so the latest version of the software is installed automatically on the Red Hat OpenShift cluster.
  • The user has knowledge of and experience with managing a Red Hat OpenShift cluster.

Prerequisites

  • Red Hat OpenShift cluster version 4.6 or later with a minimum of 48 vCPUs and 192 GB RAM
  • OpenShift Container Storage (OCS) attached to the Red Hat OpenShift cluster
  • A user with OpenShift cluster administrator and project administrator access
  • Bastion host running Linux with 2 vCPUs and 4 GB RAM
  • Internet access for the bastion host and the Red Hat OpenShift cluster
  • A supported container runtime on the workstation, either Docker or Podman
  • IBM Cloud Pak for Data entitlement key (here is the link to download the entitlement key)
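
Before starting, it can help to confirm that the workstation has a container runtime the prerequisites call for. The following is a minimal sketch and not part of the IBM procedure: the helper name first_available is hypothetical, and it only checks that a command exists on the PATH, not that its version is supported.

```shell
#!/bin/sh
# Hypothetical helper: print the first command from the argument list
# that exists on the PATH; return non-zero if none of them is found.
first_available() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# The prerequisites allow either docker or podman as the container runtime.
if runtime=$(first_available docker podman); then
  echo "Container runtime found: ${runtime}"
else
  echo "No supported container runtime (docker or podman) found" >&2
fi
```

The same helper can be reused for other tools, for example first_available oc once the OpenShift CLI is installed in Step 2.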

Step-by-Step Instructions

Step 1: Download and install the cpd-cli utility on the bastion host from this link.

Step 2: Download and install the OpenShift CLI (oc) utility on the bastion host from this link.

Step 3: Log in to your IBM container software library using this link and copy the entitlement key into a text file.

Step 4: Use this link to create and configure the environment variable file on the bastion host. Then run the following commands to check that the script runs without errors, restrict its permissions to your user, and load the variables into your current shell session.

bash ./cpd_vars.sh
chmod 700 cpd_vars.sh
source ./cpd_vars.sh
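
Every later step depends on variables from this file, so a quick check after sourcing it can catch a missing value early. This is a minimal sketch under stated assumptions: the helper name missing_vars is hypothetical, and the list of names is simply collected from the commands used in this article, not taken from the IBM template.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every listed variable that is
# unset or empty in the current shell; return non-zero if any is missing.
missing_vars() {
  status=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
      status=1
    fi
  done
  return "$status"
}

# Variables referenced by the commands in the steps of this article.
REQUIRED_VARS="OCP_URL OCP_USERNAME OCP_PASSWORD OPENSHIFT_TYPE
PROJECT_CPFS_OPS PROJECT_CPD_INSTANCE IBM_ENTITLEMENT_KEY
VERSION COMPONENTS STG_CLASS_BLOCK STG_CLASS_FILE"

# Word splitting of $REQUIRED_VARS is intentional here.
if missing=$(missing_vars $REQUIRED_VARS); then
  echo "All required variables are set"
else
  echo "Missing variables:" $missing >&2
fi
```

Run it in the same shell session in which you sourced cpd_vars.sh, because the variables are only visible there.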

Step 5: Log in to your Red Hat OpenShift Container Platform as a cluster administrator and create the appropriate projects for your environment.

oc login ${OCP_URL}
oc new-project ${PROJECT_CPFS_OPS}
oc new-project ${PROJECT_CPD_INSTANCE}
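
If you rerun this step, oc new-project reports an error for a project that already exists. A minimal sketch of a guard is shown here, assuming an oc session that is already logged in; the helper name ensure_project is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create a project only if it does not already exist,
# so the step can be rerun without "already exists" errors.
ensure_project() {
  if oc get project "$1" >/dev/null 2>&1; then
    echo "Project $1 already exists"
  else
    oc new-project "$1"
  fi
}

# Usage (requires a logged-in oc session and the sourced cpd_vars.sh):
#   ensure_project "${PROJECT_CPFS_OPS}"
#   ensure_project "${PROJECT_CPD_INSTANCE}"
```

The block only defines the function; the calls are left as comments so nothing touches the cluster until you choose to run them.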

Step 6: Log in to the OCP cluster using cpd-cli, and then run the apply-scc command to create a custom security context constraint (SCC) for Watson Knowledge Catalog.

cpd-cli manage login-to-ocp --username=${OCP_USERNAME} --password=${OCP_PASSWORD} --server=${OCP_URL}
cpd-cli manage apply-scc --cpd_instance_ns=${PROJECT_CPD_INSTANCE} --components=wkc
  • If you want to confirm that the wkc-iis-sa service account can use the wkc-iis-scc SCC, run:
oc adm policy who-can use scc wkc-iis-scc \
--namespace ${PROJECT_CPD_INSTANCE} | grep "wkc-iis-sa"

Step 7: If you are using HAProxy to access the OCP cluster, change the load balancer settings using this link.

Step 8: Run the following command to change and apply CRI-O container settings

cpd-cli manage apply-crio --openshift-type=${OPENSHIFT_TYPE} --extra-vars="pid_limit=16384"

Step 9: Run the following command to change kernel parameter settings

cpd-cli manage apply-db2-kubelet \
--openshift-type=${OPENSHIFT_TYPE}

Step 10: Run the following command to update the global image pull secret with your entitlement key

cpd-cli manage add-icr-cred-to-global-pull-secret \
${IBM_ENTITLEMENT_KEY}
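
A key pasted into a text file in Step 3 can pick up stray spaces or a trailing newline, which makes the pull secret invalid. The sketch below is one possible way to populate IBM_ENTITLEMENT_KEY from that file before running the command above. The helper name read_key and the file name entitlement_key.txt are hypothetical; use whichever file you saved the key in.

```shell
#!/bin/sh
# Hypothetical helper: read a key from a text file and strip all
# whitespace (including a trailing newline) that copy-paste can add.
read_key() {
  [ -r "$1" ] || return 1
  tr -d '[:space:]' < "$1"
}

# Hypothetical file name; point this at the file that holds the key from Step 3.
KEY_FILE="${KEY_FILE:-./entitlement_key.txt}"

if IBM_ENTITLEMENT_KEY=$(read_key "$KEY_FILE") && [ -n "$IBM_ENTITLEMENT_KEY" ]; then
  export IBM_ENTITLEMENT_KEY
  # Print only the length so the key itself never appears in the terminal.
  echo "Entitlement key loaded (${#IBM_ENTITLEMENT_KEY} characters)"
else
  echo "Could not read a key from ${KEY_FILE}" >&2
fi
```

Keeping the key file readable only by your user (for example with chmod 600) is a sensible precaution, in the same spirit as the chmod 700 applied to cpd_vars.sh in Step 4.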

Step 11: Run the following commands to install the operators for the IBM Cloud Pak for Data platform and services

  • Create the OLM objects for the specified components:
cpd-cli manage apply-olm \
--release=${VERSION} \
--components=${COMPONENTS}
  • You can optionally run the cpd-cli manage get-olm-artifacts command to get the list of catalog sources and operator subscriptions on the cluster.
cpd-cli manage get-olm-artifacts \
--subscription_ns=${PROJECT_CPFS_OPS}
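
Both apply-olm and the apply-cr command in the next step read the same ${COMPONENTS} variable, so it is worth confirming that wkc is in the list before going further. This is a minimal sketch that assumes the value is a comma-separated list of component IDs; the helper name has_component is hypothetical, and the example value is illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a comma-separated list ($1) contains
# the exact item ($2). Wrapping both in commas avoids partial matches.
has_component() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Illustrative value only; the real list comes from cpd_vars.sh.
example_components="cpfs,cpd_platform,wkc"

if has_component "$example_components" wkc; then
  echo "wkc is in the component list"
else
  echo "wkc is missing from the component list" >&2
fi
```

In practice you would pass "${COMPONENTS}" instead of the example value.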

Step 12: Run the following command to install the components using the express installation method

cpd-cli manage apply-cr \
--components=${COMPONENTS} \
--release=${VERSION} \
--cpd_instance_ns=${PROJECT_CPD_INSTANCE} \
--block_storage_class=${STG_CLASS_BLOCK} \
--file_storage_class=${STG_CLASS_FILE} \
--license_acceptance=true

Step 13: Run the following command to get the status of the installed components in the specified project.

cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INSTANCE}
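
If a component has not finished when you first check, you may want to repeat the check at intervals rather than by hand. The following is a generic retry sketch, not an IBM-provided tool: the helper name wait_until is hypothetical, and the check you pass to it is up to you. The article does not show the output format of get-cr-status, so no match pattern is suggested here.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, sleeping between
# attempts, and succeed as soon as the command succeeds.
#   $1 = maximum attempts, $2 = seconds between attempts, rest = command
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    # Sleep only if another attempt is still coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage sketch: check_status.sh is a hypothetical script of your own that
# exits 0 once the status output looks the way you expect.
#   wait_until 30 60 ./check_status.sh
```

With 30 attempts and a 60 second delay, the example call gives up after roughly half an hour; adjust both numbers to suit your cluster.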

Step 14: Get the URL and default credentials for the web client:

cpd-cli manage get-cpd-instance-details \
--cpd_instance_ns=${PROJECT_CPD_INSTANCE} \
--get_admin_initial_credentials=true

Other Useful Resources
