Flume- Getting an error when the file




I see an issue here: "/user/cloudera/flume//FlumeData.1415656620523.tmp". The temp file being created has two slashes (//) in its path. Check the value set for agent.sinks.hdfs-sink.hdfs.path. I guess you have it as "/user/cloudera/flume/"; if so, drop the trailing slash so that the setting reads agent.sinks.hdfs-sink.hdfs.path = /user/cloudera/flume.

Thanks Sravan for checking!
I changed the setting to agent.sinks.hdfs-sink.hdfs.path=/user/cloudera/flume. The double slash is gone, but the sink now fails with a NoSuchMethodError:

14/11/10 16:35:12 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: src-1 started
14/11/10 16:35:27 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /home/cloudera/flumeSpool/employees.csv to /home/cloudera/flumeSpool/employees.csv.COMPLETED
14/11/10 16:35:27 INFO hdfs.HDFSDataStream: Serializer = TEXT, UseRawLocalFileSystem = false
14/11/10 16:35:28 INFO hdfs.BucketWriter: Creating /user/cloudera/flume/FlumeData.1415666127523.tmp
14/11/10 16:35:35 ERROR hdfs.HDFSEventSink: process failed
java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache;
at org.apache.hadoop.hdfs.DomainSocketFactory.<init>(DomainSocketFactory.java:46)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:456)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:410)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:128)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2308)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:87)

Please note I am using CDH 4.4
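
Note for readers hitting the same trace: the NoSuchMethodError says that the Guava library loaded at runtime has no CacheBuilder.build() method returning Cache, which the Hadoop HDFS client calls. That usually means two different Guava versions are reachable and an older one is loaded first. Below is a minimal shell sketch for listing the Guava jars in the lib folders involved. The helper names list_guava_jars and guava_version are made up for this note, and the Hadoop lib path in the usage line is an assumption about a CDH package install, not something stated in this thread.

```shell
# List every Guava jar found under the given directories, one per line.
# Directories that do not exist are skipped silently.
list_guava_jars() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -name 'guava-*.jar' 2>/dev/null
    done | sort
}

# Print the version part of a Guava jar file name,
# e.g. /some/lib/guava-11.0.2.jar -> 11.0.2
guava_version() {
    basename "$1" .jar | sed 's/^guava-//'
}
```

Example use, with the Flume install path that appears later in this thread and an assumed Hadoop path: list_guava_jars /usr/lib/flume-ng/apache-flume-1.4.0-bin/lib /usr/lib/hadoop/lib. If the jars listed carry different versions, that mismatch is a likely cause of the error.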

Hi Thomas,

Please post your Flume config, the steps you used to create the directories, and the steps you used to run Flume.

Thanks

Steps:
mkdir /home/cloudera/flumeSpool

sudo su
cp /home/cloudera/class2/flume-conf.properties /usr/lib/flume-ng/apache-flume-1.4.0-bin/conf/

hadoop fs -mkdir /user/cloudera/flume/

cd /usr/lib/flume-ng/apache-flume-1.4.0-bin/bin/

./flume-ng agent -n agent -c conf -f /usr/lib/flume-ng/apache-flume-1.4.0-bin/conf/flume-conf.properties

cp /home/cloudera/class2/input/employees.csv /home/cloudera/flumeSpool/
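
The log earlier in the thread shows the spooling source renaming employees.csv to employees.csv.COMPLETED once it has read the file. Here is a small sketch for checking that from the shell, assuming the .COMPLETED suffix seen in that log. spool_status is a hypothetical helper name, not a Flume command.

```shell
# Count files in a spool directory that Flume has finished reading
# (renamed with the .COMPLETED suffix) versus files still waiting.
spool_status() {
    dir=$1
    done_count=0
    pending_count=0
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        case "$f" in
            *.COMPLETED) done_count=$((done_count + 1)) ;;
            *)           pending_count=$((pending_count + 1)) ;;
        esac
    done
    echo "completed=$done_count pending=$pending_count"
}
```

Example use: spool_status /home/cloudera/flumeSpool, followed by hadoop fs -ls /user/cloudera/flume to see whether the sink wrote anything. A completed file only shows that the source consumed it. In the log above, the rename happens before the sink error.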

flume-conf.properties:
/usr/lib/flume-ng/apache-flume-1.4.0-bin/conf/flume-conf.properties

# example.conf: A single-node Flume configuration



# Name the components on this agent

agent.sources = src-1

agent.sinks = hdfs-sink

agent.channels = memory-channel



# Source properties: a spooling-directory source that reads files from /home/cloudera/flumeSpool

agent.sources.src-1.type = spooldir

agent.sources.src-1.spoolDir = /home/cloudera/flumeSpool

agent.sources.src-1.fileHeader = true



# Use a channel which buffers events in memory

agent.channels.memory-channel.type = memory

agent.channels.memory-channel.capacity = 1000

agent.channels.memory-channel.transactionCapacity = 100



# Sink properties: an HDFS sink that writes the events to the path below

agent.sinks.hdfs-sink.type = hdfs

agent.sinks.hdfs-sink.hdfs.path = /user/cloudera/flume

agent.sinks.hdfs-sink.hdfs.fileType = DataStream
agent.sinks.hdfs-sink.hdfs.rollCount = 20

# Bind the source and sink to the channel

agent.sources.src-1.channels = memory-channel

agent.sinks.hdfs-sink.channel = memory-channel

This is the message I am getting (java.lang.NoSuchMethodError):

14/11/11 22:07:11 INFO hdfs.BucketWriter: Creating /user/cloudera/flume/FlumeData.1415772430830.tmp
14/11/11 22:07:19 ERROR hdfs.HDFSEventSink: process failed
java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache;
at org.apache.hadoop.hdfs.DomainSocketFactory.<init>(DomainSocketFactory.java:46)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:456)




Hi Thomas,
What is the reason for installing Flume separately?
The Cloudera VM already has Flume installed.
The Flume example that was provided works fine. It was
tested with the Flume that comes with the Cloudera VM.

We suggest that you try again with the default Flume conf that was provided by
Dezyre.

Thanks
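
To confirm which Flume launcher a plain flume-ng command would run (the one packaged with the VM, or a separately unpacked copy such as the apache-flume-1.4.0-bin directory used in the steps above), a sketch like this can help. which_flume is a hypothetical helper name. It relies only on the shell's standard command -v lookup.

```shell
# Print the path of the flume-ng launcher that the shell would run,
# or a short message (and a non-zero status) if none is on PATH.
which_flume() {
    if launcher=$(command -v flume-ng 2>/dev/null); then
        echo "$launcher"
    else
        echo "flume-ng not found on PATH"
        return 1
    fi
}
```

Once which_flume prints a path, running flume-ng version reports the Flume version that launcher belongs to.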

That's it!

I tried with the flume which comes with Cloudera-VM and it's working fine.

Thanks for your help!

Thomas,

Before trying any new version, it's always safer to work with the versions that ship with the VM. Once you gain proficiency you can try newer versions, but you also need to know what changes each release brings.

Thanks