Configuring Graph Lakehouse for Kerberos Authentication

If you plan to load data into Graph Lakehouse from an HDFS file store that uses Kerberos authentication, follow the steps below to configure each Graph Lakehouse host server.

  1. Each Graph Lakehouse host server must have the Kerberos workstation package, krb5-workstation, installed. The package provides the client tools (such as kinit) that generate the authentication token used to request encrypted ticket-granting tickets (TGT) from the key distribution center (KDC). On each server in the cluster, run the following command to install the package:
    sudo yum install -y krb5-workstation
  2. To connect to the KDC, Graph Lakehouse needs a copy of the KDC's krb5.conf file. Place a copy of krb5.conf in the /etc directory on each Graph Lakehouse host server.
  3. In addition to krb5.conf, each Graph Lakehouse server needs a copy of the .keytab file from the principal node. The keytab file and principal name are used to generate an authentication token.

    To find the location of the .keytab file and the principal name, you can look up the dfs.web.authentication.kerberos.keytab and dfs.web.authentication.kerberos.principal values in hdfs-site.xml on the HDFS master node.

    Copy the .keytab file to any location on each Graph Lakehouse host server, and then run the following command to generate the authentication token:

    kinit -k -t <path>/<keytab_file> <principal_name>

    Where <principal_name> is the Kerberos principal name and <path>/<keytab_file> is the location and name of the .keytab file.
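
The lookup and token-generation steps above can be sketched as a small shell helper. This is a minimal sketch, not part of Graph Lakehouse: the function names hdfs_property and kerberos_prereqs_ok are hypothetical, the hdfs-site.xml path in the usage comments is an assumed location, and the parser assumes each <name> and <value> element sits on a single line, as in a typical hdfs-site.xml. The usage comments pass the principal as kinit's trailing positional argument.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> of a named property from a
# Hadoop-style XML configuration file.
#   $1 = property name, $2 = path to a copy of hdfs-site.xml
hdfs_property() {
    awk -v name="$1" '
        # Literal match on the exact <name> element for this property.
        index($0, "<name>" name "</name>") { found = 1 }
        # The first <value> at or after that line belongs to the property.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$2"
}

# Hypothetical check: succeeds only if kinit is on the PATH (step 1)
# and the given krb5.conf is readable (step 2).
#   $1 = path to krb5.conf (defaults to /etc/krb5.conf)
kerberos_prereqs_ok() {
    conf="${1:-/etc/krb5.conf}"
    command -v kinit >/dev/null 2>&1 || {
        echo "kinit not found; install krb5-workstation" >&2
        return 1
    }
    [ -r "$conf" ] || {
        echo "$conf is missing or unreadable" >&2
        return 1
    }
}

# Example usage (paths are assumptions; adjust for your cluster):
#   HDFS_SITE=/etc/hadoop/conf/hdfs-site.xml
#   kerberos_prereqs_ok || exit 1
#   keytab=$(hdfs_property dfs.web.authentication.kerberos.keytab "$HDFS_SITE")
#   principal=$(hdfs_property dfs.web.authentication.kerberos.principal "$HDFS_SITE")
#   kinit -k -t "$keytab" "$principal"
#   klist -s && echo "valid ticket in the credentials cache"
```

The keytab value read from hdfs-site.xml is the file's location on the HDFS master node, so adjust it to wherever you copied the file on the Graph Lakehouse host. Running klist -s after kinit is a quiet way to confirm that the credentials cache holds an unexpired ticket; it prints nothing and reports the result through its exit status.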