Apache Ranger
Data Security

Once a user's identity has been established through authentication, authorization must determine which actions or services that identity can access.
Data security refers to the protection measures employed to safeguard digital information against unauthorized access, corruption, or theft throughout its lifecycle, preserving its confidentiality, integrity, and availability.
Digital transformation has profoundly changed how today's businesses operate and compete. The vast volume of data they create, handle, and store makes environments more complex, expanding the attack surface and making it more challenging to monitor and protect.
At the same time, growing public demand for data protection has produced several new privacy regulations, which add to long-standing security requirements. The value of data has never been greater: the loss of trade secrets or intellectual property can affect innovation and profitability.
Additionally, there are increasing challenges inherent to the complexity of today's distributed and hybrid environments.
Fortunately, cybersecurity tools are available that can reduce security risks when used as part of a security audit. To be effective, these tools must know where the data resides, track who has access to it, and block high-risk activities.

Apache Ranger Features
Apache Ranger is a framework designed to enable, monitor, and manage data security comprehensively on the Hadoop Platform.
With the introduction of Apache YARN, the Hadoop Platform began to support a broader data-lake architecture in which multiple workloads run in a "multitenant" environment. This in turn demanded a more robust data security structure, with support for monitoring data access and centralized management of security policies.
Apache Ranger was created with the following objectives:
- Centralized security administration: enable the management of all security-related tasks in a central UI or using REST APIs.
- Fine-grained authorization: control specific actions and operations on Hadoop components or tools, managed through a centralized administration tool.
- Standardize authorization methods: apply the same authorization approach across all Hadoop components.
- Enhance support for different authorization methods: role-based access control, attribute-based access control, etc.
- Centralize auditing: centralize the auditing of user access and of security-related administrative actions across all Hadoop components.
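
To make the "central UI or REST APIs" objective concrete, the following is a minimal sketch of how a policy might be prepared for submission over REST. The host name, service name, credentials, endpoint path, and JSON field names are assumptions modelled on Ranger's public v2 policy API, so check them against the documentation for your Ranger version. The request is only built, not sent.

```python
import base64
import json
import urllib.request

# Hypothetical values: replace with your own Ranger Admin URL and service name.
RANGER_URL = "https://ranger.example.com:6182"
SERVICE_NAME = "cluster_hadoop"


def build_hdfs_policy(name, path, users, access_types):
    """Return a policy document granting `access_types` on `path` to `users`."""
    return {
        "service": SERVICE_NAME,
        "name": name,
        "isEnabled": True,
        "isAuditEnabled": True,
        "resources": {
            "path": {"values": [path], "isRecursive": True},
        },
        "policyItems": [
            {
                "users": list(users),
                "accesses": [{"type": t, "isAllowed": True} for t in access_types],
            }
        ],
    }


def build_create_request(policy, username, password):
    """Prepare (but do not send) the HTTP request that would create the policy."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{RANGER_URL}/service/public/v2/api/policy",  # assumed endpoint path
        data=json.dumps(policy).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


policy = build_hdfs_policy("sales-read", "/data/sales", ["alice"], ["read", "execute"])
request = build_create_request(policy, "admin", "changeme")
print(request.get_method(), request.full_url)
print(json.dumps(policy, indent=2))
```

Passing the prepared request to `urllib.request.urlopen` would submit it to a real server. Keeping construction separate from sending makes the payload easy to inspect or test first.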

Apache Ranger Architecture
The main components of Apache Ranger are:
- Ranger Admin Server/Portal:
All policies are centrally managed through a web portal that acts as a central interface for security management. It allows defining repositories, creating and updating policies, managing users and groups, defining audit policies, and viewing audit activities. The portal itself has three distinct parts:
  - Auditing: monitors user activities at the resource level and lets administrators search audit log entries using filters.
  - Policy Manager: adds or modifies policies for groups or users, and specifies which users are admins who can access or modify policies.
  - KMS: stores the keys used in HDFS data encryption.
- Ranger Plugins:
These are Java programs embedded in the processes of the cluster components. The plugins retrieve policies from a central server and store them locally, acting as an authorization module that evaluates user requests against the retrieved security policies. They also send the resulting access data to the audit server.
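
The plugin behaviour described above (fetch policies centrally, cache them locally, evaluate each request, record an audit entry) can be illustrated with a toy model. This is a sketch only: `Policy`, `ToyPlugin`, and `fetch_policies` are hypothetical names, the real plugins are Java, and actual Ranger policy evaluation has richer semantics than this allow-only prefix match.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass
class Policy:
    resource_prefix: str      # e.g. an HDFS directory the policy covers
    users: Set[str]
    groups: Set[str]
    access_types: Set[str]

    def covers(self, resource: str) -> bool:
        # Match the directory itself or anything beneath it.
        prefix = self.resource_prefix.rstrip("/")
        return resource == prefix or resource.startswith(prefix + "/")


@dataclass
class ToyPlugin:
    fetch_policies: Callable[[], Iterable[Policy]]  # stands in for the central server call
    refresh_seconds: float = 30.0
    audit_log: List[dict] = field(default_factory=list)
    _cache: List[Policy] = field(default_factory=list)
    _fetched_at: float = float("-inf")

    def _refresh_if_stale(self) -> None:
        # Re-download policies only when the local copy is older than the interval.
        now = time.monotonic()
        if now - self._fetched_at >= self.refresh_seconds:
            self._cache = list(self.fetch_policies())
            self._fetched_at = now

    def is_allowed(self, user: str, groups: Iterable[str], resource: str, access_type: str) -> bool:
        self._refresh_if_stale()
        group_set = set(groups)
        allowed = any(
            p.covers(resource)
            and access_type in p.access_types
            and (user in p.users or bool(p.groups & group_set))
            for p in self._cache
        )
        # Every decision is recorded; a real plugin would ship this to the audit server.
        self.audit_log.append(
            {"user": user, "resource": resource, "access": access_type, "allowed": allowed}
        )
        return allowed


policies = [Policy("/data/sales", {"alice"}, {"analysts"}, {"read"})]
plugin = ToyPlugin(fetch_policies=lambda: policies)
print(plugin.is_allowed("alice", [], "/data/sales/q1.csv", "read"))         # True
print(plugin.is_allowed("bob", ["interns"], "/data/sales/q1.csv", "read"))  # False
print(len(plugin.audit_log))                                                # 2
```

Because decisions are made against the local cache, the component can keep authorizing requests between refreshes without contacting the central server on every call.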

Services Supported by Apache Ranger
Currently, the following services are supported:
Component | Supported |
---|---|
HDFS | YES |
Hive | YES |
HBase | YES |
Storm | YES |
Knox | YES |
Solr | YES |
Kafka | YES |
YARN | YES |
Ozone | YES |
Kudu | YES |
Kylin | YES |
NiFi | YES |
NiFi Registry | YES |
Sqoop | YES |
Atlas | YES |
Elasticsearch | YES |
Presto | YES |
Schema Registry | YES |

Best Practices for Apache Ranger
- For HDFS authorization, change the HDFS umask from 022 to 077 to prevent any new file or folder from being accessed by anyone other than the owner.
- Auditing in Apache Ranger can be controlled as a policy.
- Configure SSL for all enabled plugins.
- When Apache Ranger is installed via Ambari, a default policy with auditing enabled is created for all files and directories in HDFS. Ambari uses this policy to perform a smoke test via "Ambari QA" to verify HDFS services. If administrators disable this default policy, they should create a similar policy to enable auditing across all files and folders.
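
The effect of the umask change recommended in the first item above can be checked with a little arithmetic: the resulting permission is the requested mode with the umask bits cleared. The sketch below assumes the usual requested modes of 777 for directories and 666 for files. In Hadoop the umask is typically set through the `fs.permissions.umask-mode` property, which you should verify for your distribution.

```python
def effective_mode(requested_mode: int, umask: int) -> int:
    """Clear every permission bit that is set in the umask."""
    return requested_mode & ~umask


for umask in (0o022, 0o077):
    dir_mode = effective_mode(0o777, umask)   # assumed default request for directories
    file_mode = effective_mode(0o666, umask)  # assumed default request for files
    print(f"umask {umask:03o}: directories {dir_mode:03o}, files {file_mode:03o}")

# umask 022: directories 755, files 644
# umask 077: directories 700, files 600
```

With 077, group and other users get no bits at all on new files and folders, so access for anyone but the owner has to be granted explicitly, for example through Ranger policies.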

Apache Ranger Project Details
Apache Ranger was predominantly developed in Java.
