Thrift API for HDFS
===================

Introduction:
=============

The Hadoop Distributed File System (HDFS) is written in Java. An application
that wants to store data in HDFS or fetch data from it can use the Java API.
This means that applications that are not written in Java cannot
access HDFS in an elegant manner.

Thrift is a software framework for scalable cross-language services
development. It combines a powerful software stack with a code generation
engine to build services that work efficiently and seamlessly
between C++, Java, Python, PHP, and Ruby.

This project exposes the HDFS APIs through the Thrift software stack. This
allows applications written in many languages to access
HDFS elegantly.


The Application Programming Interface (API)
===========================================
The HDFS API that is exposed through Thrift can be found in if/hadoopfs.thrift.

Compilation
===========
The compilation process builds a server, org.apache.hadoop.thriftfs.HadoopThriftServer,
that implements the Thrift interface defined in if/hadoopfs.thrift.

The Thrift compiler is used to generate API stubs in Python, PHP, Ruby,
Cocoa, etc. The generated code is checked into the directories gen-*.
The generated Java API is checked into lib/hadoopthriftapi.jar.

There is a sample Python script, hdfs.py, in the scripts directory. When
invoked, this script starts a HadoopThriftServer in the background and then
communicates with HDFS using the API. This script is for demonstration purposes
only.
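The pattern described above (start the server in the background, then talk to
it through the generated stubs) can be sketched roughly as follows. This is a
minimal illustration and not the contents of hdfs.py. The helper names
wait_for_port and open_thrift_client are hypothetical, client_class stands for
whatever Client class the generated Python stubs provide, and the use of a
buffered transport with the binary protocol is an assumption about how the
server is configured. The Thrift Python runtime (the thrift package) is
needed only by the second helper.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once a TCP connect to (host, port) succeeds, False on timeout.

    Useful after launching a server in the background: the client should not
    issue calls until the server is actually listening.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def open_thrift_client(client_class, host, port):
    """Open a Thrift connection and wrap it in a generated client.

    client_class is hypothetical here: pass the Client class from the
    generated Python stubs. Assumes the server speaks the binary protocol
    over an unframed socket.
    """
    # Imported lazily so wait_for_port works without the Thrift runtime.
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    transport.open()
    # The caller should call transport.close() when finished.
    return client_class(protocol), transport
```

A caller would typically launch the server process, call wait_for_port with
the server's host and port, and only then call open_thrift_client. Keeping the
readiness check separate from the Thrift code means it can be tested on its
own with nothing but the standard library.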