How can experiments be more systematic and comparable?
Participants: Arno, Chuang, Guoli, Matteo, Michael
What is the State of the Art?
- Creation of "meaningful" distributions (e.g., Zipf distribution)?
- Use data from search engines -- didn't work too well
It is possible, though. I will put a simple generator online when I get back home and correct the code - for now it is only ugly bash+awk. For the general idea you can take a look here: http://doi.acm.org/10.1145/1266894.1266939
- Use PlanetLab for realistic processing and communication delays
We also need meaningful real-life topologies. While one can use any of the many available generators, real-life data is always applicable. As one possible starting point, you could look here: http://www.opte.org/
-> What are others doing?
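A minimal sketch of how a Zipf-like subscription workload could be generated. This is not the bash+awk generator mentioned above, which is not shown here. The function names, the exponent `s`, and the fixed number of subscriptions per client are assumptions chosen for illustration only.

```python
import random
from collections import Counter


def zipf_weights(n_topics, s=1.0):
    """Return normalized Zipf probabilities for topic ranks 1..n_topics."""
    raw = [1.0 / (rank ** s) for rank in range(1, n_topics + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def generate_subscriptions(n_subscribers, n_topics, subs_per_client=3, s=1.0, seed=0):
    """Give each subscriber a set of distinct topics drawn by Zipf popularity.

    Topic 0 is the most popular topic, topic n_topics - 1 the least popular.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    weights = zipf_weights(n_topics, s)
    topics = list(range(n_topics))
    wanted = min(subs_per_client, n_topics)
    workload = {}
    for client in range(n_subscribers):
        chosen = set()
        while len(chosen) < wanted:
            chosen.add(rng.choices(topics, weights=weights, k=1)[0])
        workload[client] = sorted(chosen)
    return workload


if __name__ == "__main__":
    workload = generate_subscriptions(n_subscribers=1000, n_topics=50)
    popularity = Counter(t for subs in workload.values() for t in subs)
    # Show the five topics with the most subscribers.
    for topic, count in popularity.most_common(5):
        print(topic, count)
```

Publishing the seed and parameters together with such a generator would let others regenerate exactly the same workload.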
What Are We Striving For?
- Have large data-sets for different scenarios
- Find models for distribution of events, subscriptions, etc. and extrapolate
- Define benchmarks based on the scenarios/models
-> Anything important missing?
What can WE do?
- Publish data used for evaluation on the website!
- Publish workload generator/simulation used!
- Use pub/sub and produce our own data
-> Has this already been done?
Where can we obtain realistic workloads and data sets?
- For a start: NYSE data (contains no subscription information)
- Use traces from peer-to-peer systems (e.g., Gnutella, traces from University of Washington)
- How can they be converted to be used for our purposes?
- Look at the benchmark suites currently being developed there [UPDATE: Matteo]
- Information filtering/retrieval benchmarks (Annika?)
- Can we come up with a kind of game to gather data?
- Workload scheduling trace
- Intrusion detection systems may provide realistic data (e.g., Snort)
- The Gryphon project has data used in papers
- Ask TIBCO for data
- Use information from applications that build on pub/sub
- eBay as a potential source (scrape data and publish)
- Use information from business processes
- Can we instrument games (like multiplayer games where characters subscribe to events in their neighborhood)?
- Can we use data from, e.g., WebCQ?
- Can we use information extracted from weblogs?
- PlanetLab as a source for processing/communication delays?
AOL Log data available here: http://www.gregsadetsky.com/aol-data/ If it goes off-line I can send you a CD (-- ZbigniewJerzak)
What Data Do We Need?
- How are subscriptions distributed, and what do they look like?
- Predicates, attributes, values
- What about composite events?
- How are publications distributed?
- Message rates?
- Subscriptions, publications, and meta-data
- Locality of interest?
- What does the topology look like?
- Broker degree, connectedness, communication delays, bandwidth
- How are clients joining and leaving the system?
- All of this depends on the application!
-> Anything missing?
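As a sketch of what a published data set might need to record, the items above (predicates, attributes, values, and clients joining and leaving) can be expressed as a small data structure. All class and field names below are hypothetical; composite events, topology, and message rates are deliberately left out.

```python
import operator
from dataclasses import dataclass
from typing import Optional

# Comparison operators a predicate may use (an assumed, minimal set).
OPS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


@dataclass(frozen=True)
class Predicate:
    attribute: str
    op: str
    value: object

    def matches(self, event):
        """True if the event carries the attribute and satisfies the comparison."""
        if self.attribute not in event:
            return False
        return OPS[self.op](event[self.attribute], self.value)


@dataclass
class Subscription:
    client_id: str
    predicates: list                  # conjunction of Predicate objects
    joined_at: float = 0.0            # time the client subscribed
    left_at: Optional[float] = None   # time the client left, None if still active

    def matches(self, event, timestamp):
        """True if the subscription is active at timestamp and all predicates hold."""
        active = self.joined_at <= timestamp and (
            self.left_at is None or timestamp < self.left_at
        )
        return active and all(p.matches(event) for p in self.predicates)


if __name__ == "__main__":
    sub = Subscription(
        "c1",
        [Predicate("symbol", "=", "IBM"), Predicate("price", ">", 100.0)],
        joined_at=0.0,
        left_at=50.0,
    )
    event = {"symbol": "IBM", "price": 101.5}
    print(sub.matches(event, timestamp=10.0))  # True
    print(sub.matches(event, timestamp=60.0))  # False, the client has left
```

A trace in this form keeps subscriptions, publications, and churn together, which is what a source such as the NYSE data lacks.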
What benchmarks ''do'' exist or ''should'' exist?
- There is an EU project WASP with a work group on benchmarks called "Network-level benchmarks"
- There is work on EP application scenarios (cf. Dagstuhl Seminar) [UPDATE: Arno]
- "Application kernels" to modularize benchmark
- There is an EU project on benchmarks for EP (rule-based?) systems [UPDATE: Arno]
How have other communities developed and adopted benchmarks?
- There is a VLDB paper on benchmarks for SP
- Benchmarks for JMS (Alex Buchmann?)
- TPC / SPEC benchmarks
- Peer-to-Peer
-> Are there any benchmarks driven by academia?
What are realistic models for workload generation?
What are good performance metrics?
I would like to draw the attention of the DEBS community to our paper titled "Constructing scalable overlay for pub-sub with many topics", which is published in PODC'07. The paper is available from http://www.ifi.uio.no/~romanvi/Papers/scalable-overlay-theory.ps

This work is decidedly not about a new pub-sub system; it rather attempts to formally capture and theoretically analyze a fundamental problem of building and evaluating pub-sub overlays. Since many existing pub-sub systems have been tackling this problem from the practical standpoint, perhaps this paper can be considered a (rather small) step towards creating the unifying theory of pub-sub.

Specifically, we believe that our work provides the following potential benefits for the DEBS community:

1. It includes and can be further extended toward evaluation criteria for pub-sub overlays. This may be relevant for the effort of creating commonly used pub-sub benchmarks.
2. It determines theoretical limits of what a practical pub-sub system designer should strive and can hope to achieve. In particular, it includes a nearly optimal centralized algorithm for building an overlay, which can be used as a baseline for distributed implementations in practice.

The current paper version only targets topic-based pub-sub. Since this is a conference version limited in length, the list of references is very far from being comprehensive. In particular, we did not cite any major work on content-based pub-sub. We do intend to compile a comprehensive list of citations for the full version of this paper. This is an additional reason why feedback from the DEBS community would be so useful.
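To illustrate what an evaluation criterion for a topic-based overlay could look like, the sketch below checks two simple properties: whether the nodes interested in each topic form a connected sub-overlay, and the average node degree. Whether these match the exact formulation used in the paper should be checked against the paper itself. The function names and input format are assumptions made for this example.

```python
from collections import defaultdict, deque


def is_topic_connected(edges, interests):
    """Check that, for every topic, the interested nodes are connected
    using only overlay links between nodes interested in that topic.

    edges: iterable of (u, v) undirected overlay links
    interests: dict mapping node -> set of topics
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    members = defaultdict(set)
    for node, topics in interests.items():
        for topic in topics:
            members[topic].add(node)

    for nodes in members.values():
        # Breadth-first search restricted to nodes interested in this topic.
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor in nodes and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if seen != nodes:
            return False
    return True


def average_degree(edges, interests):
    """Average number of overlay links per node (duplicate links counted once)."""
    n = len(interests)
    unique_edges = {frozenset(e) for e in edges}
    return 2 * len(unique_edges) / n if n else 0.0


if __name__ == "__main__":
    interests = {"a": {"t1"}, "b": {"t1", "t2"}, "c": {"t2"}, "d": {"t1"}}
    chain = [("a", "b"), ("b", "c"), ("c", "d")]
    # d reaches b only via c, which is not interested in t1.
    print(is_topic_connected(chain, interests))                 # False
    print(is_topic_connected(chain + [("b", "d")], interests))  # True
    print(average_degree(chain + [("b", "d")], interests))      # 2.0
```

A check of this kind could be run on the overlays produced by different systems for the same workload, with the number of links or the average degree reported alongside it.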
