<?xml version="1.0"?>
<!--
  Copyright 2002-2004 The Apache Software Foundation

  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->

<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN"
          "http://forrest.apache.org/dtd/document-v20.dtd">


<document>
<header>
<title>C API to HDFS: libhdfs</title>
<meta name="http-equiv">Content-Type</meta>
<meta name="content">text/html;</meta>
<meta name="charset">utf-8</meta>
</header>
<body>
<section>
<title>C API to HDFS: libhdfs</title>

<p>
libhdfs is a JNI-based C API for Hadoop's Distributed File System (HDFS). It provides C wrappers for a subset of the HDFS APIs to manipulate HDFS files and the filesystem. libhdfs is part of the Hadoop distribution and ships pre-compiled as ${HADOOP_HOME}/libhdfs/libhdfs.so.
</p>

</section>
<section>
<title>The APIs</title>

<p>
The libhdfs APIs are a subset of the <a href="api/org/apache/hadoop/fs/FileSystem.html">Hadoop FileSystem APIs</a>.
</p>
<p>
The header file for libhdfs describes each API in detail and is available at ${HADOOP_HOME}/src/c++/libhdfs/hdfs.h.
</p>
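<p>
For orientation, a few representative declarations follow. These are paraphrased from hdfs.h and may not match your version exactly; the header itself is the authoritative reference.
</p>
<source>
/* Paraphrased from ${HADOOP_HOME}/src/c++/libhdfs/hdfs.h; consult the
   header for the full documentation of each call. */
hdfsFS   hdfsConnect(const char* host, tPort port);
hdfsFile hdfsOpenFile(hdfsFS fs, const char* path, int flags,
                      int bufferSize, short replication, tSize blocksize);
tSize    hdfsRead(hdfsFS fs, hdfsFile file, void* buffer, tSize length);
tSize    hdfsWrite(hdfsFS fs, hdfsFile file, void* buffer, tSize length);
int      hdfsCloseFile(hdfsFS fs, hdfsFile file);
int      hdfsDisconnect(hdfsFS fs);
</source>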
</section>
<section>
<title>A sample program</title>

<source>
#include "hdfs.h"

#include &lt;fcntl.h&gt;   /* O_WRONLY, O_CREAT */
#include &lt;stdio.h&gt;   /* fprintf */
#include &lt;stdlib.h&gt;  /* exit */
#include &lt;string.h&gt;  /* strlen */

int main(int argc, char **argv) {

    hdfsFS fs = hdfsConnect("default", 0);
    if (!fs) {
        fprintf(stderr, "Failed to connect to HDFS!\n");
        exit(-1);
    }
    const char* writePath = "/tmp/testfile.txt";
    hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 0);
    if (!writeFile) {
        fprintf(stderr, "Failed to open %s for writing!\n", writePath);
        exit(-1);
    }
    const char* buffer = "Hello, World!";
    tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, strlen(buffer) + 1);
    if (num_written_bytes == -1 || hdfsFlush(fs, writeFile)) {
        fprintf(stderr, "Failed to write and flush %s\n", writePath);
        exit(-1);
    }
    hdfsCloseFile(fs, writeFile);
    hdfsDisconnect(fs);
    return 0;
}
</source>
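<p>
A complementary sketch (not part of the shipped examples) reads the file written above back; error handling is abbreviated:
</p>
<source>
#include "hdfs.h"

#include &lt;fcntl.h&gt;   /* O_RDONLY */
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;

int main(int argc, char **argv) {

    hdfsFS fs = hdfsConnect("default", 0);
    const char* readPath = "/tmp/testfile.txt";
    hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, 0, 0, 0);
    if (!readFile) {
        fprintf(stderr, "Failed to open %s for reading!\n", readPath);
        exit(-1);
    }
    char buffer[32];
    /* hdfsRead returns the number of bytes actually read, or -1 on error */
    tSize num_read_bytes = hdfsRead(fs, readFile, (void*)buffer, sizeof(buffer) - 1);
    if (num_read_bytes == -1) {
        fprintf(stderr, "Failed to read %s\n", readPath);
        exit(-1);
    }
    buffer[num_read_bytes] = '\0';
    printf("Read %d bytes: %s\n", (int)num_read_bytes, buffer);
    hdfsCloseFile(fs, readFile);
    hdfsDisconnect(fs);
    return 0;
}
</source>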
</section>

<section>
<title>How to link with the library</title>
<p>
See the Makefile for hdfs_test.c in the libhdfs source directory (${HADOOP_HOME}/src/c++/libhdfs/Makefile), or use a command like the following:
</p>
<source>
gcc above_sample.c -I${HADOOP_HOME}/src/c++/libhdfs -L${HADOOP_HOME}/libhdfs -lhdfs -o above_sample
</source>
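<p>
At run time the program must also be able to locate libhdfs.so and the JVM's libjvm.so. A minimal sketch, assuming a typical JDK layout (the exact libjvm.so directory varies by JDK and architecture):
</p>
<source>
# Illustrative paths; adjust for your JDK and platform.
export LD_LIBRARY_PATH=${HADOOP_HOME}/libhdfs:${JAVA_HOME}/jre/lib/amd64/server:${LD_LIBRARY_PATH}
./above_sample
</source>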
</section>
<section>
<title>Common problems</title>
<p>
The most common problem is that the CLASSPATH is not set properly when calling a program that uses libhdfs. Make sure it includes all the Hadoop jars needed to run Hadoop itself. There is currently no way to generate the classpath programmatically, but a good bet is to include all the jar files in ${HADOOP_HOME} and ${HADOOP_HOME}/lib, as well as the correct configuration directory containing hdfs-site.xml.
</p>
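<p>
A minimal sketch of building such a CLASSPATH in a POSIX shell (the exact jar set depends on your installation):
</p>
<source>
# Illustrative: collect every jar in ${HADOOP_HOME} and ${HADOOP_HOME}/lib,
# starting from the configuration directory.
CLASSPATH=${HADOOP_HOME}/conf
for jar in ${HADOOP_HOME}/*.jar ${HADOOP_HOME}/lib/*.jar; do
  CLASSPATH=${CLASSPATH}:${jar}
done
export CLASSPATH
</source>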
</section>
<section>
<title>libhdfs is thread safe</title>
<p>Concurrency and Hadoop FS "handles": the Hadoop FS implementation includes an FS handle cache that is keyed on the namenode URI together with the connecting user. So all calls to hdfsConnect return the same handle, while calls to hdfsConnectAsUser with different users return different handles. But since HDFS client handles are completely thread safe, this has no bearing on concurrency.
</p>
<p>Concurrency and libhdfs/JNI: the libhdfs calls to JNI should always create thread-local storage, so (in theory) libhdfs should be as thread safe as the underlying calls to the Hadoop FS.
</p>
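<p>
A minimal sketch of sharing one connection across threads, assuming POSIX threads (the file paths and thread count are illustrative):
</p>
<source>
#include "hdfs.h"

#include &lt;fcntl.h&gt;
#include &lt;pthread.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

/* One shared FS handle; hdfsConnect returns the cached handle for the
   same namenode and user, and the handle itself is thread safe. */
static hdfsFS fs;

static void* worker(void* arg) {
    const char* path = (const char*)arg;   /* per-thread file path */
    hdfsFile f = hdfsOpenFile(fs, path, O_WRONLY|O_CREAT, 0, 0, 0);
    if (!f) {
        fprintf(stderr, "Failed to open %s for writing!\n", path);
        return NULL;
    }
    const char* msg = "written from a worker thread\n";
    hdfsWrite(fs, f, (void*)msg, strlen(msg));
    hdfsCloseFile(fs, f);
    return NULL;
}

int main(void) {
    fs = hdfsConnect("default", 0);
    if (!fs) return 1;

    pthread_t t1, t2;
    pthread_create(&amp;t1, NULL, worker, (void*)"/tmp/thread1.txt");
    pthread_create(&amp;t2, NULL, worker, (void*)"/tmp/thread2.txt");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    hdfsDisconnect(fs);
    return 0;
}
</source>
<p>
Compile with the flags shown above plus -lpthread.
</p>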
</section>
</body>
</document>