
Apache Ambari Installation

Once the minimum requirements are met, all machines in the environment are prepared, and the installation packages are available, we proceed to the installation of the Apache Ambari service.

Once installed, Apache Ambari will be responsible for creating the Big Data Cluster, either through its web page or via REST API.

Installing the Ambari Server

Run the command below to install the Ambari Server via the installation script on a utility or edge machine [1]:

Terminal input
sh ambari-tdp-installer-el-9.sh -b URL_BASE_TO_COMPONENT_PACKAGES -u USERNAME -p PASSWORD;

The USERNAME (-u) and PASSWORD (-p) parameters are optional and should only be provided when the package repository requires access credentials, as is the case with the Tecnisys Public Package Repository.

note

The following are the available options for the ambari-tdp-installer-el-9.sh installation script:

  • -b, --baseurl
    Base URL for the component packages.
    Default:

    Terminal input
    https://repo.tecnisys.com.br/yum/tdp/ambari/2.3.0/el-9-x86_64
  • -u, --username
    Username for access to the package repository.
    Should be provided when using the Tecnisys Package Repository URL.

  • -p, --password
    Password for access to the package repository.
    Should be provided when using the Tecnisys Package Repository URL.

  • -c, --component (agent | server | all)
    [ default: server ]
    Apache Ambari component to be installed.
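
As a rough sketch of how these options combine, the helper below only prints an installer command line and runs nothing. The function name `build_install_cmd` is hypothetical and not part of TDP. It adds the credential flags only when both a username and a password are supplied, and it falls back to the documented default component, server.

```shell
#!/bin/sh
# Hypothetical helper (not part of TDP): prints the installer command line
# for a given base URL, optional credentials and optional component.
build_install_cmd() {
    baseurl="$1"
    user="$2"
    pass="$3"
    component="${4:-server}"   # documented default: server

    cmd="sh ambari-tdp-installer-el-9.sh --baseurl \"$baseurl\" --component $component"
    # Credentials are optional: add them only when both are provided.
    if [ -n "$user" ] && [ -n "$pass" ]; then
        cmd="$cmd --username \"$user\" --password \"$pass\""
    fi
    printf '%s\n' "$cmd"
}

# Example: local repository, no credentials, default component.
build_install_cmd "file:///opt/repo/tdp/ambari/2.3.0/el-9-x86_64" "" ""
```

Printing the command first makes it easy to review the flags before running the script on the target machine.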

Ambari Server Installation Scenarios

Below, we provide examples for different scenarios:

  • Installation using the Tecnisys Public Package Repository:

    • Terminal input
      sh /repo/yum/tdp/installer/2.3.0/el-9-x86_64/ambari-tdp-installer-el-9.sh --username "user" --password "pass"

    The access credentials (username and password) are set when you register for free on the Tecnisys website.

  • Installation using a local package repository:

    • Terminal input
      sh /repo/yum/tdp/installer/2.3.0/el-9-x86_64/ambari-tdp-installer-el-9.sh --baseurl "file:///opt/repo/tdp/ambari/2.3.0/el-9-x86_64"
  • Installation using a local package repository on the machine with IP 192.168.32.100 and accessible via HTTPS:

    • Terminal input
      sh /repo/yum/tdp/installer/2.3.0/el-9-x86_64/ambari-tdp-installer-el-9.sh --baseurl "https://192.168.32.100/tdp/ambari/2.3.0/el-9-x86_64"
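
Before running any of these scenarios, it can help to confirm that the base URL is usable. The sketch below is a hypothetical pre-check (`check_baseurl` is not part of TDP): for a `file://` URL it tests that the directory exists, and for `http://` or `https://` it requests only the headers with `curl`. A repository that requires credentials, or a server that blocks directory requests, will fail this simple check even if the installer would work.

```shell
#!/bin/sh
# Hypothetical pre-check (not part of TDP): returns 0 when the base URL
# looks usable, non-zero otherwise.
check_baseurl() {
    url="$1"
    case "$url" in
        file://*)
            # Strip the scheme and test that the directory exists locally.
            [ -d "${url#file://}" ]
            ;;
        http://*|https://*)
            # Headers only; --fail turns HTTP errors into a non-zero exit.
            curl --silent --head --fail --output /dev/null "$url"
            ;;
        *)
            echo "unsupported base URL: $url" >&2
            return 2
            ;;
    esac
}

# Usage (not executed here):
#   check_baseurl "file:///opt/repo/tdp/ambari/2.3.0/el-9-x86_64" && echo "base URL looks usable"
```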

Installation via tdpctl

Ambari Server installation can also be performed through tdpctl, the TDP Command Line Interface:

  • Terminal input
    tdpctl install --help

The available options, as well as the information required to install Ambari Server, are the same as those requested by the installation script.

Figure 1 - Options for the tdpctl install command

Note that the desired component can be specified with the -c or --component (agent | server | all) option.

Figure 2 - Ambari Server installation via tdpctl

To install tdpctl, follow the guidance available here.

Configurations

Configuring the JDBC Driver

Configure the JDBC driver to be used by the Ambari Server to connect to the metadata database:

Terminal input
    ambari-server setup --jdbc-db=postgres --jdbc-driver=postgresql-42.2.16.jar 
tip

The PostgreSQL JDBC driver is also available in the public/ambari/3.0.0.3 directory of the Tecnisys Package Repository.

Figure 3 - JDBC Configuration

Configuring the Ambari Server

Configure the Ambari Server:

Terminal input
    ambari-server setup 

Next, provide the requested information at the presented prompt:

1- If SELinux is active, a warning will be displayed. Confirm if SELinux can be switched to permissive mode and temporarily disabled. Default (y)

2- Confirm if you want to use a custom OS user for the Ambari Server daemon. Default (n)

(a)- If yes (y), provide the desired user.

3- If the Firewall (iptables) is active, a warning will be displayed. Confirm if the ports used by Ambari are accessible. Default (y)

(a)- If there is a port conflict, an error will be displayed, and the configuration will be aborted.

4- Choose the JDK to be used. Default (1) Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8 (automatically downloaded from the public/tdp/2.0.0/rhel-7-x86_64 directory of the Tecnisys Package Repository).

(a)- If option (1) is chosen, confirm if you accept the usage license. Default (y)
(b)- Choose option (2) to provide the JAVA_HOME path of a different JDK.

5- Confirm if you want the Ambari Server to download and install additional LZO compression packages. Default (n)

6- Confirm if you want to proceed with the Advanced Database Configuration. Default (n)

(a)- If no (n), Ambari automatically initializes a PostgreSQL instance (an installation dependency of the Ambari Server), creates its own metadata database, schema, user, metadata tables, etc.
(b)- If yes (y), connection information for the database will be requested, such as the Database Management System (DBMS), hostname, port, and Ambari database name, so that the Ambari Server can connect and automatically create its metadata tables and other structures.
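
If you plan to choose option (2) in step 4, it is worth validating the JAVA_HOME candidate before starting the setup. The check below is a minimal sketch, assuming that a usable JDK home is a directory containing an executable `bin/java`. The function name `valid_java_home` is hypothetical and not part of Ambari.

```shell
#!/bin/sh
# Hypothetical check (not part of Ambari): succeeds when the directory looks
# like a JDK home, i.e. it contains an executable bin/java.
valid_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Example: validate the JAVA_HOME from the current environment, if any.
if valid_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME candidate: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no executable bin/java"
fi
```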

Figure 4 - Ambari Server Configuration

Starting the Ambari Server

Start the Ambari Server daemon:

Terminal input
    ambari-server start 
Figure 5 - Ambari Server Initialization

After the Ambari Server starts, the Ambari web page will be accessible by default on port 8080.

Figure 6 - Ambari Login Page
note

If necessary, the Ambari web interface port can be changed in the /etc/ambari-server/conf/ambari.properties file.
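
To find out which port a given installation uses, the properties file can be read directly. The sketch below assumes the port is stored under the `client.api.port` property, as in upstream Apache Ambari; this name does not appear in this guide, so verify it against your file. The helper `ambari_port` is hypothetical and prints 8080 when the property is absent or the file cannot be read.

```shell
#!/bin/sh
# Hypothetical helper: prints the web port configured in an ambari.properties
# file, or 8080 when the property is absent or the file is unreadable.
# Assumption: the property is named client.api.port (upstream Apache Ambari).
ambari_port() {
    props="${1:-/etc/ambari-server/conf/ambari.properties}"
    port=""
    if [ -r "$props" ]; then
        # Take the last uncommented assignment and strip the key and whitespace.
        port="$(sed -n 's/^[[:space:]]*client\.api\.port[[:space:]]*=[[:space:]]*//p' "$props" | tail -n 1 | tr -d '[:space:]')"
    fi
    printf '%s\n' "${port:-8080}"
}

ambari_port
```

The result can be combined with a request such as `curl -I "http://localhost:$(ambari_port)/"` to confirm that the web interface responds after startup.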

Creating Metadata Databases

Many of the services/components of the TDP Platform require databases for metadata persistence.

Below, we provide guidelines for creating these databases in PostgreSQL (the default DBMS for the Ambari Server). If you have chosen a different DBMS for your environment, please refer to its official documentation.

warning

The passwords used in the following commands only illustrate the procedure. Please use strong passwords in your environment. The user and database names are also only suggestions; feel free to change them.

caution

Pay attention to the collation (locale and encoding settings, for example: en_US.UTF-8 and pt_BR.UTF-8) of the metadata databases. Some components may not function correctly or may have issues with displaying, comparing, and sorting data and information if the collation is not configured correctly.
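
One way to control the collation is to set the encoding and locale explicitly when creating each database, instead of inheriting the instance defaults. The sketch below prints such a `createdb` command. The helper name `createdb_cmd` and the default locale en_US.UTF-8 are assumptions for illustration; the chosen locale must be installed on the database host.

```shell
#!/bin/sh
# Hypothetical helper: prints a createdb command that pins encoding and locale
# instead of inheriting them from the instance defaults. Nothing is executed.
createdb_cmd() {
    owner="$1"
    db="$2"
    locale="${3:-en_US.UTF-8}"   # assumed default, adjust to your environment
    # template0 allows a locale different from the one of the default template.
    printf 'createdb -O %s -E UTF8 --locale=%s -T template0 %s\n' "$owner" "$locale" "$db"
}

createdb_cmd airflow airflow pt_BR.UTF-8
```

To inspect the settings of existing databases, a query such as `SELECT datname, pg_encoding_to_char(encoding), datcollate, datctype FROM pg_database;` can be run through psql.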

Creating the Apache Airflow Metadata Database

  1. Create the airflow user:

    Terminal input
    su -c 'createuser -s -d -l airflow' - postgres; 
  2. Set the airflow user's password:

    Terminal input
    su -c "psql -U postgres -c \"alter user airflow with password 'airflow'\"" - postgres; 
  3. Create the airflow database:

    Terminal input
    su -c 'createdb -O airflow airflow' - postgres; 

Creating the Apache Druid Metadata Database

  1. Create the druid user:

    Terminal input
    su -c 'createuser -s -d -l druid' - postgres; 
  2. Set the druid user's password:

    Terminal input
    su -c "psql -U postgres -c \"alter user druid with password 'druid'\"" - postgres; 
  3. Create the druid database:

    Terminal input
    su -c 'createdb -O druid druid' - postgres; 

Creating the Apache Hive Metadata Database

  1. Create the hive user:

    Terminal input
    su -c 'createuser -s -d -l hive' - postgres; 
  2. Set the hive user's password:

    Terminal input
    su -c "psql -U postgres -c \"alter user hive with password 'hive'\"" - postgres; 
  3. Create the hive database:

    Terminal input
    su -c 'createdb -O hive hive' - postgres; 

Creating the Apache Ranger Metadata Database

  1. Create the ranger user:

    Terminal input
    su -c 'createuser -s -d -l ranger' - postgres; 
  2. Set the ranger user's password:

    Terminal input
    su -c "psql -U postgres -c \"alter user ranger with password 'ranger'\"" - postgres; 
  3. Create the ranger database:

    Terminal input
    su -c 'createdb -O ranger ranger' - postgres; 

Creating the Apache RangerKMS Metadata Database

  1. Create the rangerkms user:

    Terminal input
    su -c 'createuser -s -d -l rangerkms' - postgres; 
  2. Set the rangerkms user's password:

    Terminal input
    su -c "psql -U postgres -c \"alter user rangerkms with password 'rangerkms'\"" - postgres; 
  3. Create the rangerkms database:

    Terminal input
    su -c 'createdb -O rangerkms rangerkms' - postgres; 

Creating the Apache Superset Metadata Database

  1. Create the superset user:

    Terminal input
    su -c 'createuser -s -d -l superset' - postgres; 
  2. Set the superset user's password:

    Terminal input
    su -c "psql -U postgres -c \"alter user superset with password 'superset'\"" - postgres; 
  3. Create the superset database:

    Terminal input
    su -c 'createdb -O superset superset' - postgres; 
    Figure 7 - Ambari Server Setup - metadata creation
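
The six sections above repeat the same three steps with a different name each time. As a sketch of that pattern, the hypothetical helper `metadata_db_commands` below prints the three commands for one service without executing them, and the loop prints them for every service covered in this guide. The `CHANGE_ME_` passwords are placeholders and must be replaced with strong passwords.

```shell
#!/bin/sh
# Hypothetical helper: prints the three commands used in the sections above
# for one service. Nothing is executed.
metadata_db_commands() {
    name="$1"
    password="$2"
    # 1. Create the user.
    printf '%s\n' "su -c 'createuser -s -d -l $name' - postgres"
    # 2. Set the user's password.
    printf '%s\n' "su -c \"psql -U postgres -c \\\"alter user $name with password '$password'\\\"\" - postgres"
    # 3. Create the database owned by that user.
    printf '%s\n' "su -c 'createdb -O $name $name' - postgres"
}

# Print the plan for every service covered in this guide.
for svc in airflow druid hive ranger rangerkms superset; do
    metadata_db_commands "$svc" "CHANGE_ME_$svc"
done
```

Reviewing the printed commands before running them keeps the procedure identical to the manual steps while avoiding copy-and-paste mistakes between services.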

Configuring the Metadata Databases Instance

Below, we provide guidelines and code examples for configuring access rules and properties of PostgreSQL (the default DBMS for the Ambari Server). If you have chosen a different DBMS for your environment, please refer to its official documentation.

note

Check and, if necessary, change the PostgreSQL major version that appears in the configuration file paths.

  1. Add the other databases to the Ambari access rule:

    Terminal input
        sed -i 's/ambari,mapred/airflow,ambari,druid,hive,mapred,oozie,ranger,superset/g' /var/lib/pgsql/14/data/pg_hba.conf 
  2. Configure the PostgreSQL instance to accept connections from all network interfaces:

    Terminal input
        sed -i "s/#listen_addresses = 'localhost'/listen_addresses = '*'/g" /var/lib/pgsql/14/data/postgresql.conf 
  3. Increase the maximum number of connections supported by the PostgreSQL instance:

    Terminal input
        sed -i "s/max_connections = 100/max_connections = 500/g" /var/lib/pgsql/14/data/postgresql.conf 
  4. Restart the PostgreSQL service to apply all changes:

    Terminal input
        systemctl restart postgresql-14 
    Figure 8 - Metadata Configuration
    warning

    Assess the impact of these configurations on the security and performance of your environment. If you need support, contact us; we will be happy to assist you.
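
Because the PostgreSQL major version is part of every path in the steps above, it can be convenient to pass the data directory as a parameter. The sketch below applies the same three substitutions to the files in a given directory and keeps `.bak` copies of the originals. The function name `configure_pg` and the `PG_MAJOR` variable are hypothetical; the database list is copied verbatim from step 1, so extend it if your environment uses other database names.

```shell
#!/bin/sh
# Hypothetical helper: applies the three edits above to the configuration
# files in a given PostgreSQL data directory, keeping .bak copies.
configure_pg() {
    datadir="$1"   # e.g. /var/lib/pgsql/14/data; adjust the major version

    # Add the other databases to the Ambari access rule.
    sed -i.bak 's/ambari,mapred/airflow,ambari,druid,hive,mapred,oozie,ranger,superset/g' \
        "$datadir/pg_hba.conf" &&
    # Listen on all interfaces and raise the connection limit.
    sed -i.bak \
        -e "s/#listen_addresses = 'localhost'/listen_addresses = '*'/g" \
        -e "s/max_connections = 100/max_connections = 500/g" \
        "$datadir/postgresql.conf"
}

# Usage (not executed here):
#   PG_MAJOR=14
#   configure_pg "/var/lib/pgsql/$PG_MAJOR/data" && systemctl restart "postgresql-$PG_MAJOR"
```

Running the function a second time does not change the files again, but it does overwrite the `.bak` copies, so save the originals elsewhere if you need them.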

Footnotes

  1. Utility machines are typically used for auxiliary tasks, such as cluster management, while Edge machines are dedicated to graphical interfaces or components used by users at the "edge" of the environment.