Subcommunities

by Michael Yoon, Lars Pind and Jon Salz


ACS4: Work In Progress

The Big Picture

Most online communities actually become collections of discrete subcommunities. For example, a corporate intranet/extranet serves both units within the company (e.g., offices, departments, teams, projects) and external parties (e.g., customers, partners, vendors). The ACS enables you to provide each subcommunity with its own "virtual website" or subsite, by assembling modules that together deliver a feature set tailored to the needs of the subcommunity.

A user can be a member of more than one subcommunity (an employee is located at an office, works on one or more projects, and is a member of a team within a department), so subsites provide an intuitive partitioning of information (to find out when a project is scheduled to launch, go to the project subsite; to learn about benefits, go to the HR department subsite; etc.), which improves the usability of the site as a whole.

This does not imply that navigating subsites is the only way for users to get information from subcommunities. Rather, the ACS will also provide a user-centric portal (along the lines of My Yahoo!) that integrates relevant content from all subcommunities to which the user belongs. The real challenge is figuring out what is actually relevant to the user; doing this is what's known, in industry terms, as personalization. (New Stuff is a simple example of personalization, based on the assumption that most users will be interested in new content.) User profiling provides the foundation for really effective personalization.

In practice, "subcommunity" is simply a fancy name for "user group." Accordingly, we use the user groups data model to represent subcommunities.

Site Specifications

A site specification is simply a collection of modules that together comprise either an initial template for subsites or the actual definition of a subsite.

For each type of subcommunity (i.e., user_group_type), you can define a site specification, to designate modules that you believe any subcommunity of that type will find useful. For example, to build a family collaboration service along the lines of MyFamily.com, you (as the site administrator) would:

  1. Define a "Family" user_group_type
  2. Construct a site specification for families by choosing from the menu of available modules, perhaps selecting Discussion Forums, Chat, Calendar, Address Book, File Storage, Bookmarks, Photo DB, and Webmail

To build an ecommerce ASP (application service provider) like Yahoo! Store or Amazon.com zShops, you would:

  1. Define a "Merchant" user_group_type
  2. Construct a site specification for merchants, selecting Ecommerce only

The data model for site specifications consists of two simple tables:

create table ad_site_specifications (
        site_spec_id    integer not null
                        constraint ad_site_specs_pk
                        primary key,
        on_which_table  varchar(30) not null,
        on_what_id      varchar(30),
        constraint ad_site_specs_uk
        unique (on_which_table, on_what_id)
);

-- on_what_id is not numeric because the primary key of the
-- user_group_types table (the group_type column) is a varchar;
-- should we add a numeric primary key to user_group_types? ugh.

create table ad_site_spec_modules (
        site_spec_id    not null
                        constraint ad_site_spec_mods_spec_id_fk
                        references ad_site_specifications (site_spec_id),
        module_key      not null
                        constraint ad_site_spec_mods_mod_key_fk
                        references acs_modules (module_key),
        constraint ad_site_spec_mods_pk
        primary key (site_spec_id, module_key)
);

-- We may want to add audit trails for these tables

-- Note: The above tables supersede the existing
-- user_group_type_module_map table

To define a site specification for a given user_group_type like "Family" or "Merchant", we insert a row into ad_site_specifications with "user_group_types" as the value of the on_which_table column and the appropriate group_type in the on_what_id column.

Subsites

When a subcommunity (user_group) is created, a corresponding subsite will be created automatically if a site specification exists for that type of subcommunity (user_group_type). The modules designated by the specification are installed in the subsite and made available for immediate use. This is what we mean by saying that a site specification can function as an initial template for subsites ("initial" in the sense that subsite administrators can remove default modules and/or add new ones).

Subsites themselves are also represented as site specifications. For example, a "Merchant" subsite would be created by copying the site specification for the "Merchant" user_group_type, resulting in another row in the ad_site_specifications table (with "user_groups" as the value of on_which_table and the appropriate group_id as the value of on_what_id) and all the corresponding child rows in ad_site_spec_modules.
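
The copy step described above can be sketched as follows. This is only an illustration in Python against an in-memory SQLite database, not Oracle and not ACS code; the simplified schema, the helper names `make_db` and `create_subsite_spec`, and the sample group_id of 42 are all assumptions made for the example.

```python
import sqlite3

def make_db():
    """Create an in-memory database with simplified versions of the two tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        create table ad_site_specifications (
            site_spec_id   integer primary key,
            on_which_table varchar(30) not null,
            on_what_id     varchar(30),
            unique (on_which_table, on_what_id)
        );
        create table ad_site_spec_modules (
            site_spec_id integer not null references ad_site_specifications,
            module_key   varchar(30) not null,
            primary key (site_spec_id, module_key)
        );
    """)
    return db

def create_subsite_spec(db, group_type, group_id):
    """Copy the template spec for group_type into a new spec for one user group.

    Returns the new site_spec_id, or None when no template exists.
    """
    row = db.execute(
        "select site_spec_id from ad_site_specifications"
        " where on_which_table = 'user_group_types' and on_what_id = ?",
        (group_type,),
    ).fetchone()
    if row is None:
        return None
    cur = db.execute(
        "insert into ad_site_specifications (on_which_table, on_what_id)"
        " values ('user_groups', ?)",
        (str(group_id),),
    )
    new_id = cur.lastrowid
    # Copy the child rows so the subsite starts with the template's modules.
    db.execute(
        "insert into ad_site_spec_modules (site_spec_id, module_key)"
        " select ?, module_key from ad_site_spec_modules where site_spec_id = ?",
        (new_id, row[0]),
    )
    return new_id

db = make_db()
db.execute("insert into ad_site_specifications values (1, 'user_group_types', 'merchant')")
db.execute("insert into ad_site_spec_modules values (1, 'ecommerce')")
spec_id = create_subsite_spec(db, "merchant", 42)
```

Because the rows are copied rather than referenced, later edits to the template's rows leave the new subsite untouched.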

One important issue to consider is change propagation, i.e., what, if anything, happens to a subsite (user_group site spec) when its template (user_group_type site spec) is modified, e.g., a module is added or removed. The simplest model is to have subsites branch irrevocably from their templates upon creation, so that no changes made to templates are ever propagated to subsites. This is what the ACS implements.

A more sophisticated and complex change propagation model, which the ACS does not implement, is presented below, in Appendix B.

Special Cases

We treat the site-wide community as a degenerate case of subcommunity, represented by the predefined "all_users" user group and corresponding subsite. In this way, modules can be installed for the site as a whole, just as they are for subsites.

On the opposite extreme, another degenerate case is the subcommunity of one, i.e., the individual user. By allowing for personal subsites, we can use standard modules to provide a Yahoo!-like suite of integrated services:

  ACS module      Comparable Yahoo! service
  Calendar        Yahoo! Calendar
  Address Book    Yahoo! Address Book
  Webmail         Yahoo! Mail
  Publishing      Yahoo! GeoCities

Personal subsites are represented by rows in the ad_site_specifications table, with "users" as the value of on_which_table and the appropriate user_id as the value of on_what_id.

A predefined site specification (with an on_which_table value of "users" and an on_what_id value of null) is used as the template for all personal subsites.

Finally, not all modules make sense in the context of a personal subsite (e.g., "information pushing" services like News and collaborative services like Chat or Discussion Forums), so the Package Management metadata for each module should explicitly state whether or not it can be used in a personal subsite.

Subsite Administration

Overall responsibility for administration of each subsite belongs to either users in the "administrator" role for the corresponding user group or, in the case of personal subsites, the owning user. We represent this responsibility with a general_permissions row granting "administer" permission on the appropriate row in the ad_site_specifications table.

Responsibility can be delegated at either the subsite level (by granting the "administrator" role to other group members) or the module level (by inserting general_permissions rows granting "administer" permission on the appropriate row in the ad_site_spec_modules table).

(One implication of this model is that we could decide to replace the predefined Site-Wide Administration group with the "administrator" role of the all_users site spec. Practically speaking, this might prove challenging.)

(Also, this model does not make sense with personal subsites, where the user experience should be seamless, whether performing what are normally considered "administrative" actions or regular end-user actions.)

Subsite URLs

The URL of an ACS subsite consists of two parts:

  1. the hostname
  2. the subsite's "page root" (i.e., the path under which all pages in the subsite appear)

By default, the latter part (the subsite page root) is:

/<group_type_plural>/<group_name>/

with subsite-wide administration pages at:

/<group_type_plural>/<group_name>/admin/

The path for a given subsite module is:

/<group_type_plural>/<group_name>/<module_url_stub>/

with module-level administration pages at:

/<group_type_plural>/<group_name>/<module_url_stub>/admin/

So, for instance, the URL for the Boston office's subsite on our hypothetical corporate website would be:

http://www.company.com/offices/boston/

and news specific to the Boston office would be found at:

http://www.company.com/offices/boston/news/
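
The default path scheme can be expressed as a small helper. The function name `subsite_path` and its parameters are hypothetical, chosen for this sketch; it illustrates the URL layout only and is not an ACS API.

```python
def subsite_path(group_type_plural, group_name, module_url_stub=None, admin=False):
    """Build the page root (or a module or admin path) for a group subsite."""
    parts = [group_type_plural, group_name]
    if module_url_stub is not None:
        parts.append(module_url_stub)
    if admin:
        parts.append("admin")
    return "/" + "/".join(parts) + "/"

print(subsite_path("offices", "boston"))                      # /offices/boston/
print(subsite_path("offices", "boston", admin=True))          # /offices/boston/admin/
print(subsite_path("offices", "boston", "news"))              # /offices/boston/news/
print(subsite_path("offices", "boston", "news", admin=True))  # /offices/boston/news/admin/
```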

Since the virtual directories that correspond to user_group_types appear directly under the actual page root, there is the possibility of name collision with directories in the filesystem. To address this, we log an error for each collision detected when the server starts (which Watchdog will then report to the site administrators) and give precedence to the real directories.
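
A startup check along these lines might look like the sketch below. The function names are hypothetical, and the list of real directories is passed in rather than read from disk so that the example stays self-contained; in practice it would presumably come from a listing of the page root.

```python
import logging

def find_collisions(group_type_plurals, real_directories):
    """Return the virtual directory names that are shadowed by real directories."""
    return sorted(set(group_type_plurals) & set(real_directories))

def servable_virtual_roots(group_type_plurals, real_directories):
    """Log each collision and return only the virtual roots that remain reachable."""
    collisions = find_collisions(group_type_plurals, real_directories)
    for name in collisions:
        # The real directory takes precedence, so the virtual root is dropped.
        logging.error("subsite root /%s/ collides with a real directory", name)
    return sorted(set(group_type_plurals) - set(collisions))

roots = servable_virtual_roots(["offices", "projects", "doc"], ["doc", "admin"])
```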

The aforementioned special cases of the public site and personal subsites are handled differently:

The public site is served from directly under the page root. Continuing our example from above, company-wide news would appear at:

http://www.company.com/news/

As for personal subsites, their URLs take the form:

/users/<user_id>/

or (if the user has chosen a screen name):

/users/<screen_name>/

(Should we also support /users/<email_address>/?)
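
Taken together, the URL rules above suggest a resolution step like the following sketch. The function `resolve_subsite` is hypothetical. It returns the name found in the URL, which would still have to be looked up to obtain the actual group_id or user_id, and it treats every unmatched path as belonging to the public site.

```python
def resolve_subsite(path, group_type_plurals, real_directories=()):
    """Map a URL path to (on_which_table, key_from_url, rest_of_path).

    key_from_url is the name taken from the URL, not yet a database id.
    """
    segments = [s for s in path.split("/") if s]
    # Real directories take precedence over virtual ones.
    if len(segments) >= 2 and segments[0] not in real_directories:
        if segments[0] == "users":
            return ("users", segments[1], "/".join(segments[2:]))
        if segments[0] in group_type_plurals:
            return ("user_groups", segments[1], "/".join(segments[2:]))
    # Everything else belongs to the public site (the all_users group).
    return ("user_groups", "all_users", "/".join(segments))

print(resolve_subsite("/offices/boston/news/", {"offices"}))
print(resolve_subsite("/news/", {"offices"}))
print(resolve_subsite("/users/123/", {"offices"}))
```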

Under the Hood

Clearly, the Subcommunities architecture has both design and implementation implications for the rest of ACS. Now, every page must ask the question, "Where (i.e., in which subsite) is the user?" For each page request, the answer to this question is key to providing the appropriate response. For example, a visitor to /news/ should be served only public news (i.e., news items for the all_users pseudo-subcommunity), while a visitor to /offices/boston/news/ should be served only news items specific to the Boston office subcommunity.

In order to make this work, we need to be able to relate entities to their enclosing subcommunities, which should be a straightforward process:

  1. Add a foreign key column that references ad_site_specifications in each "top-level" table (i.e., tables that do not already have a parent table, such as bboard_topics).
  2. On each page, use the ad_conn API (to be determined) to identify the enclosing subcommunity.
  3. Include the corresponding site_spec_id as a criterion in any query against the top-level table(s).

For example, by default, the Ticket Tracker index page presents a full listing of tickets assigned to the logged-in user. This page will be enhanced to filter out any tickets that do not belong in the subcommunity identified by the "virtual page root" portion of the URL: "/" for the all_users subcommunity, "/users/123" for the personal subsite of user #123, etc.
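
The three steps above can be illustrated with a toy version of the ticket listing. The `tickets` table, its columns, and the sample rows are invented for this sketch and do not reflect the real Ticket Tracker data model; SQLite stands in for Oracle, and the site_spec_id is passed in directly because the ad_conn API is still to be determined.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table tickets (
        ticket_id    integer primary key,
        assignee_id  integer not null,
        site_spec_id integer not null,  -- step 1: link to the enclosing subsite
        title        varchar(100) not null
    );
    insert into tickets values (1, 123, 10, 'Fix public home page');
    insert into tickets values (2, 123, 20, 'Boston office printer');
    insert into tickets values (3, 456, 20, 'Boston office phones');
""")

def tickets_for(db, user_id, site_spec_id):
    """List a user's tickets, restricted to one subsite (step 3)."""
    rows = db.execute(
        "select title from tickets"
        " where assignee_id = ? and site_spec_id = ?"
        " order by ticket_id",
        (user_id, site_spec_id),
    )
    return [title for (title,) in rows]

# Step 2 would supply the site_spec_id of the enclosing subsite; here it is 20.
boston_tickets = tickets_for(db, 123, 20)
```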

More "how-to" info and examples coming soon

Sharing Data Between Subcommunities

While there are some communities where data will rarely, if ever, cross subcommunity boundaries (e.g., our hypothetical ecommerce ASP, in which each subcommunity is an independent merchant), many others do need to share data between subcommunities.

For instance, an employee's contact information should be available both in the company directory (i.e., Address Book) and (at least in part) on the subsites of her project(s), for access by the customer. Another example is Calendar: calendar events should "cascade" down, so that an employee's calendar is composed of events from every level up the organizational hierarchy.

If we make the simplifying assumption that each entity has one and only one owner, what remains is to establish a many-to-one (instead of one-to-one) relationship between subsite and entity, so that, for instance, an address book entry "owned" by the company-wide directory (all_users subcommunity) would be viewable in the context of other subsites. We can do this generically with a table like:

TBD...

Appendix A: Hierarchical Subcommunities

In reality, many communities contain not one but many levels of subcommunity that together comprise a hierarchy. For instance, at ArsDigita, the Operations department consists of teams, each of which is responsible for multiple projects. It makes sense to represent "department", "team", and "project" as user_group_types.

The current definition of the user_groups table includes a parent_group_id column that allows us to establish these hierarchical relationships between actual user groups, but does not address two separate but related needs:

  1. Relationships other than a singular parent-child relationship, e.g., there are multiple parties involved in an ArsDigita project: the team, the customer, potentially one or more partners (graphic design firms, etc.). In the context of our intranet/extranet, each of these parties is a user_group, and the project itself is a user_group. The parent_group_id column alone is not enough to represent all of these relationships, so the current data model would force us to extend the automatically-generated project_info data model, e.g.:
    create table project_info (
            ...
            customer_id     not null
                            constraint project_info_customer_id_fk
                            references customer_info (group_id)
    );
    
    create table project_partner_map (
            project_id      not null
                            constraint proj_partner_map_project_id_fk
                            references project_info (group_id),
            partner_id      not null
                            constraint proj_partner_map_partner_id_fk
                            references partner_info (group_id)
    );
    
    There is no reason that these relationships cannot be stored generically, just as all manner of user-to-user-group relationships are stored in the user_group_map table. We accomplish this by introducing a new table, user_group_relationships (better name?):
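    create table user_group_relationships (
            -- sketch only; the column names are placeholders
            group_id_one        not null
                                references user_groups (group_id),
            group_id_two        not null
                                references user_groups (group_id),
            relationship_type   varchar(30) not null
    );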
    

  2. Relationships exist not only at the user_groups level but also at the user_group_types level. Our data model should be able to express business rules like "every project must be assigned to a team" and "every department must belong to a company":
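    create table user_group_type_relationships (
            -- sketch only; the column names are placeholders
            group_type_one      not null
                                references user_group_types (group_type),
            group_type_two      not null
                                references user_group_types (group_type),
            relationship_type   varchar(30) not null
    );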
    
Extending our data model to address these two needs has a number of implications.

First, the user_groups uniqueness constraint would change from just short_name to short_name and parent_group_id. Thus, each node in the subcommunity hierarchy becomes its own namespace, like directories in a filesystem, instead of the single flat namespace for user groups that exists today.

Secondly, it would probably make sense for subsite URLs to reflect the subcommunity hierarchy. For example, consider an outsourced intranet service where company is the top-level subcommunity. Multiple companies might have offices in Boston, and the subsites for those offices should be located at unambiguous URL paths such as:

/arsdigita/boston/

and

/greylock/boston/
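
A hierarchical URL of this kind could be derived by walking parent_group_id links up to the root. In the sketch below, the dictionaries stand in for the user_groups table, and the function name and sample group ids are hypothetical.

```python
def hierarchical_path(group_id, parent_of, short_name_of):
    """Build a URL path from the root of the hierarchy down to group_id."""
    names = []
    seen = set()  # guards against an accidental cycle in the parent links
    while group_id is not None and group_id not in seen:
        seen.add(group_id)
        names.append(short_name_of[group_id])
        group_id = parent_of.get(group_id)
    return "/" + "/".join(reversed(names)) + "/"

# Two companies (ids 1 and 2), each with its own Boston office (ids 3 and 4).
parent_of = {3: 1, 4: 2}
short_name_of = {1: "arsdigita", 2: "greylock", 3: "boston", 4: "boston"}

print(hierarchical_path(3, parent_of, short_name_of))  # /arsdigita/boston/
print(hierarchical_path(4, parent_of, short_name_of))  # /greylock/boston/
```

The two offices share the short name "boston" yet resolve to distinct paths, which is the per-node namespace behavior described above.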

Appendix B: Change Propagation

Maintaining a "live link" between the user_group_type site spec and its children makes it possible to do nice things like add a new module at the user_group_type level and have it become instantly available to all subcommunities of that type. The price is the extra complexity of handling situations like: a module is removed from the user_group_type site spec; what happens to all the dependent user_group site specs?

To support this model of change propagation, we would probably add more columns to ad_site_specifications, e.g.:

create table ad_site_specifications (
        ...
        parent_spec_id      constraint ad_site_specs_parent_spec_id_fk
                            references ad_site_specifications (site_spec_id),
        sync_with_parent_p  char(1) not null
                            constraint ad_site_spec_sync_w_parent_p_cc
                            check (sync_with_parent_p in ('t', 'f'))
);

Of course, there would also need to be code and user interface to handle conflict cases like the module removal scenario above.
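
To make the trade-off concrete, here is one possible reading of the two proposed columns, sketched in Python with in-memory objects instead of database rows. The class and function names are hypothetical, and the handling of removals (report affected children rather than change them) is an assumption of this sketch, since the text leaves that policy open.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SiteSpec:
    site_spec_id: int
    parent_spec_id: Optional[int] = None
    sync_with_parent_p: str = "f"  # 't' or 'f', as in the check constraint
    modules: set = field(default_factory=set)

def add_module(specs, template_id, module_key):
    """Add a module to a template and to every child that syncs with it."""
    specs[template_id].modules.add(module_key)
    for spec in specs.values():
        if spec.parent_spec_id == template_id and spec.sync_with_parent_p == "t":
            spec.modules.add(module_key)

def remove_module(specs, template_id, module_key):
    """Remove a module from the template only; return child specs still using it.

    What to do with those children is the open conflict case, so this sketch
    reports them instead of deciding.
    """
    specs[template_id].modules.discard(module_key)
    return sorted(
        s.site_spec_id for s in specs.values()
        if s.parent_spec_id == template_id and module_key in s.modules
    )

specs = {
    1: SiteSpec(1, modules={"news"}),
    2: SiteSpec(2, 1, "t", {"news"}),
    3: SiteSpec(3, 1, "f", {"news"}),
}
add_module(specs, 1, "chat")
conflicts = remove_module(specs, 1, "news")
```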

API


michael@arsdigita.com