import-bot (20211) [Avatar] Offline
[Originally posted by rapido]

I had a crazy thought today that I wanted to toss out there for some feedback.

I have stepped into a project that has a data collector component. Data is
collected in a number of different forms, and a number of different ways.

The process for gathering data is unique to a particular data feed. These
feeds change over time. New feeds are added. Old feeds are dropped. The
tasks involved in a data feed vary from feed to feed. Here are a couple of
examples:

1: Get images from data provider A
a. download (ftp) a manifest that describes the available images
b. look at the manifest for images that match certain criteria
c. download (ftp) those selected images to an image server

2: Get images from data provider B
a. execute a query that describes a set of available images
b. download (http) those selected images to an image server

3: TabDelim data feed from data provider C
a. download (ftp) a tab-delimited file from the provider
b. perform a basic validity check against the data (correct number of
columns)
c. move the file to the data process server (where it will be parsed and
loaded into a database).
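
The validity check in step 3b is small enough to sketch in plain Java. This is only an illustration, assuming the check is nothing more than a column count per row; the names TabDelimValidator and findBadLines are made up for the example and do not come from any of the existing scripts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: checks that every row of a tab-delimited feed
// has the expected number of columns.
final class TabDelimValidator {

    private TabDelimValidator() {}

    // Returns the 1-based line numbers of rows whose column count is wrong.
    // An empty list means the file passed the check.
    static List<Integer> findBadLines(List<String> lines, int expectedColumns) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            // The -1 limit keeps trailing empty columns, so "a\tb\t" counts as 3.
            int columns = lines.get(i).split("\t", -1).length;
            if (columns != expectedColumns) {
                bad.add(i + 1);
            }
        }
        return bad;
    }
}
```

The lines could come from java.nio.file.Files.readAllLines on the downloaded file; an empty result would mean the file is ready to move on to the data process server.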

There are about 30 different feeds. The scripts that run the current feeds
have been written over time in a number of different languages. There is
little in-house knowledge about these scripts. The scripts are not

I've been tasked with adding a new data feed, but also keeping my eye on
possibly bringing some of these other feeds under the same umbrella.

As I installed Ant to manage the build of my codebase (Java) for this new
feed, I wondered whether Ant might be a good candidate for carrying out the
collection tasks themselves. In other words, try to use Ant as the data
collector, coding custom tasks only where absolutely necessary.

I see a number of advantages in using Ant, but I may be totally out of line:
1. highly configurable through an XML file
2. plugin architecture for extending the functionality through Java
3. task-based methodology, highly responsive to ever-changing business needs
4. lower-level tasks can be tested independently of higher-level tasks
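
To make points 3 and 4 concrete, here is a rough plain-Java sketch of the shape I have in mind, written without Ant so it stands alone. The names FeedStep and Feed are hypothetical, and the shared map used as a context is just an assumption to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical step contract: each step reads and updates a shared context.
interface FeedStep {
    void run(Map<String, Object> context);
}

// A feed is an ordered collection of named steps.
final class Feed {
    // LinkedHashMap preserves the order in which steps were added.
    private final Map<String, FeedStep> steps = new LinkedHashMap<>();

    Feed step(String stepName, FeedStep step) {
        steps.put(stepName, step);
        return this;
    }

    // Runs the steps in insertion order and returns the names of the
    // steps that completed.
    List<String> run(Map<String, Object> context) {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, FeedStep> entry : steps.entrySet()) {
            entry.getValue().run(context);
            completed.add(entry.getKey());
        }
        return completed;
    }
}
```

Because each step is its own small object, a download or validation step can be exercised on its own with a hand-built context before it is wired into a full feed.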

Is this a totally crazy idea?
Re: Using Ant to Collect Data?
[Originally posted by steve_l]

Hi robert.

Firstly, I should point you to the Ant user mailing list as the best place
for general Ant discussion. We are there, and so are many other people; the
quality of discourse is usually good and you get broad insight.

Secondly, yes, writing 'components' as custom Ant tasks is a way to do things.
I have done stuff like this for some image processing functions, and frankly
an Ant task entry point is great for
- configuration via parameters, filesets, etc.
- incorporation into a build process.
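
As a rough illustration of that entry-point style, here is a sketch of a bean that follows the Ant conventions: one setter per attribute and a public execute() method. It deliberately does not extend org.apache.tools.ant.Task so that it compiles without ant.jar; Ant can wrap such a class through its TaskAdapter, though a production task would normally extend Task. The class name, the manifestText and suffix attributes, and the suffix-based selection rule are all assumptions made for the example; a real task would more likely take the manifest as a file attribute.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical task bean in the Ant style: setters for configuration,
// execute() for the work.
class SelectImagesTask {
    private String manifestText;     // manifest content, one image name per line
    private String suffix = ".jpg";  // selection criterion (an assumption)
    private final List<String> selected = new ArrayList<>();

    public void setManifestText(String manifestText) { this.manifestText = manifestText; }
    public void setSuffix(String suffix) { this.suffix = suffix; }
    public List<String> getSelected() { return selected; }

    public void execute() {
        if (manifestText == null) {
            // A real Ant task would throw org.apache.tools.ant.BuildException here.
            throw new IllegalStateException("manifestText attribute is required");
        }
        selected.clear();
        for (String line : manifestText.split("\n")) {
            String name = line.trim();  // also drops a trailing '\r'
            if (!name.isEmpty() && name.endsWith(suffix)) {
                selected.add(name);
            }
        }
    }
}
```

Keeping the logic behind plain setters and execute() also means the class can be driven from an ordinary test or main method, which softens the debugging problem somewhat.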

Note that it is harder to debug Ant tasks than standalone code.

Also note that Ant is a bit weak at 'set' operations. If you want to get a
lot of images from an FTP site, compare them and then fetch the ones that
fail, you may want to (somehow) wrap that in a single custom task. Of course,
this task can create FTP tasks to do its work. Otherwise, get the foreach task
that is covered in chapter 10 and use that for bulk work.
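
The 'set' part of such a custom task is easy to express in Java. A minimal sketch, assuming the comparison is by file name only (a real check might also compare sizes or timestamps); MissingImages and toFetch are hypothetical names:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for the comparison step of a bulk fetch.
final class MissingImages {

    private MissingImages() {}

    // Returns the names that appear in the remote listing but not in the
    // local one, in sorted order. Neither input set is modified.
    static Set<String> toFetch(Set<String> remote, Set<String> local) {
        Set<String> missing = new TreeSet<>(remote);
        missing.removeAll(local);
        return missing;
    }
}
```

The resulting set is what the task would then hand to its FTP step, one download per name.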