Service Authorization in Hadoop

Hadoop, as we know, consists of HDFS and YARN as subcomponents, along with Hadoop Core (Common), a library shared by both. While HDFS and YARN each have their own resource-specific access controls, some controls apply to both. These are termed service level authorization.

Service level authorization checks are applied right after authentication and before the resource-specific access controls.

Let’s say you want to read a file in HDFS. First, Hadoop authenticates the user. Next, service level authorization checks determine whether the user is authorized to access HDFS at all. Finally, the user's file-level permissions are checked.

To enable service level authorization, set hadoop.security.authorization to true in core-site.xml. For this change to take effect, the Hadoop services must be restarted.
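As a minimal sketch, the setting described above would look like this inside the configuration element of core-site.xml. The property name and value come from the text; the surrounding file layout is the usual Hadoop configuration format.

```xml
<!-- core-site.xml: turn on service level authorization -->
<configuration>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
```

An existing core-site.xml will normally contain other properties as well; only the property element needs to be added to it.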

Then update hadoop-policy.xml to specify the users and groups who are permitted to access each protocol belonging to HDFS or YARN. These changes can be made effective without a restart by running the refresh command for the corresponding service:

hdfs dfsadmin -refreshServiceAcl
yarn rmadmin -refreshServiceAcl
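To illustrate the hadoop-policy.xml step, here is a sketch of one entry. The property security.client.protocol.acl controls which clients may talk to the HDFS client protocol. The user names alice and bob and the group hadoopusers are hypothetical placeholders, not values from this article. An ACL value is a comma-separated list of users, then a space, then a comma-separated list of groups; a value of * permits everyone.

```xml
<!-- hadoop-policy.xml: restrict who may use the HDFS client protocol -->
<!-- alice, bob and hadoopusers are hypothetical example names -->
<configuration>
  <property>
    <name>security.client.protocol.acl</name>
    <value>alice,bob hadoopusers</value>
  </property>
</configuration>
```

With an entry like this, a user who is neither listed nor a member of the listed group would be rejected at the service level, before any file permissions are consulted.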

 

Please see the Apache documentation on Service Level Authorization for more details.

 

About the Author: Benoy Antony

I am an Apache Hadoop Committer and have worked as an engineer/architect at companies like eBay and PayPal. Please see my LinkedIn profile for full details.
