Data Pipeline

part of the ArsDigita Community System by Oumung Mehrotra, Michael Bryzek

The Big Picture

When processing forms with a lot of fields, it is a real pain to write the insert and update sql statements, and to enforce basic validation. It would be nice to have a layer of abstraction that allowed you to not worry about the sql statements and, instead, focus on building the large html form.

It turns out this idea is often good even for small forms. An abstraction of the sql statement means changes only need to be made in 1 place - the html form, and not in the add and edit pages.

The Medium-Sized Picture

We want to focus almost all of our effort on creating the html form. Data Pipeline is based on the proper naming of html form elements. The structure is:
dp.<table name>.<column name>.<data type>
The dp at the start of the field name is used by data_pipeline to identify the elements for data pipeline to insert into some table. Data type is an optional argument that defaults to text.

To process a form, you simply call dp_process. This procedure reads all the variables that begin with "dp" from [ns_getform], and generates an insert or update statement as appropriate.

Arguments to dp_process (using ad_proc):

Supported data types:
The following data types are currently supported. To be supported means that there is some mechanism to validate input. When validation fails, data-pipeline returns with an ad_return_complaint describing the errors. Note that there is currently no way to specify "not null."

Example - a signup form

This example is a bit contrived, but let's say your users fill out the form below. Your data model is:
create table inquiry_contact (
       contact_id             integer primary key,
       name		      varchar(100),
       phone		      varchar(100),
       email  	              varchar(200)
);

create table inquiries (
       inquiry_id             integer primary key,
       contact_id             not null references inquiry_contact,
       comments		      clob
);
Appearance HTML


1 Your name:

2 Your email address:

3 Your phone number:

4 Do you have any comments for us?


<form method=post action=signup-2>
<input type=hidden name=dp.inquiries.inquiry_id.int value=1003>
<input type=hidden name=dp_c.inquiry_contact.contact_id.int value=123>
<p>1 Your name:
<br><dd><input type=text name=dp_c.inquiry_contact.name size=45>

<p>2 Your email address:
<br><dd><input type=text name=dp_c.inquiry_contact.email.email size=45>

<p>3 Your phone number:
<br><dd><input type=text name=dp_c.inquiry_contact.phone.phone size=45>

<p>4 Do you have any comments for us?
<br><dd><textarea name=dp.inquiries.comments.clob cols=45 rows=5 wrap=soft></textarea>

<p><dd><input type=submit></form>

Now you want to process the data. Let's say you just want to make sure name is not empty. Here's the tcl script to do your error checking and to process the form:

set_form_variables 0
# dp variables: contact_id, inquiry_id, name, email, phone, comments

set exception_count 0
set exception_text ""

if { ![exists_and_not_null dp_c.inquiry_contact.name] } {
    append exception_text "  <li> Please enter your name"
    incr exception_count
}

if { $exception_count > 0 } {
    ad_return_complaint $exception_count $exception_text
    return
}

set form_setid [ns_getform]

# Need to create the relation
ns_set put $form_setid dp.inquiries.contact_id ${dp_c.inquiry_contact.contact_id.int}

set db [ns_db gethandle]

ns_db dml $db "begin transaction"

# update or insert the record in inquiry_contact 
dp_process -db $db -form_index "_c" -where_clause "contact_id=${dp_c.inquiry_contact.contact_id.int}"

# update or insert the record in inquiries
dp_process -db $db -where_clause "inquiry_id=${dp.inquiries.inquiry_id.int}"

ns_db dml $db "end transaction"

ns_returnredirect /

The sql trace generated would look like:

select count(1) from inquiry_contact where contact_id=123 update inquiry_contact set contact_id=123, name='Mike', email='mbryzek@arsdigita.com', phone='555-555-5555' where contact_id=123 insert into inquiry_contact ( contact_id,name,email,phone ) values ( 123,'Mike','mbryzek@arsdigita.com','555-555-5555' ) select count(1) from inquiries where inquiry_id=1003 update inquiries set inquiry_id=1003, comments='Nothing to say', contact_id='123' where inquiry_id=1003 insert into inquiries ( inquiry_id,comments,contact_id ) values ( 1003,'Nothing to say','123' )

Things to note: we only use ns_ora clob_dml if the clob you're inserting is indeed longer than 4000 characters. Note that the "select count(1) ...." seems redundant when we could use ns_ora resultrows. Unfortunately, resultrows won't work after [ns_ora clob_dml ...].

Future Enhancements

We need to add more datatypes. Namely, better date handling is important and it would be nice to incorporate validate_ad_dateentrywidget and the new validate_ procs in the utilities file. Also, we'd like to expand the error checking to allow for a way to specify not null.
written by mbryzek@arsdigita.com in March 2000