Annotated XML: Queries and Provenance

J. Nathan Foster, Todd J. Green, Val Tannen
Department of Computer and Information Science, University of Pennsylvania

ABSTRACT

We present a formal framework for capturing the provenance of data appearing in XQuery views of XML. Building on previous work on relations and their (positive) query languages, we decorate unordered XML with annotations from commutative semirings and show that these annotations suffice for a large positive fragment of XQuery applied to this data. In addition to tracking provenance metadata, the framework can be used to represent and process XML with repetitions, incomplete XML, and probabilistic XML, and provides a basis for enforcing access control policies in security applications. Each of these applications builds on our semantics for XQuery, which we present in several steps: we generalize the semantics of the Nested Relational Calculus (NRC) to handle semiring-annotated complex values, we extend it with a recursive type and structural recursion operator for trees, and we define a semantics for XQuery on annotated XML by translation into this calculus.

Categories and Subject Descriptors: H.2.1 [Database Management]: Data Models
General Terms: Theory, Algorithms, Languages
Keywords: Data provenance, semirings, complex values, XML, XQuery.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. PODS'08, June 9-12, 2008, Vancouver, BC, Canada. Copyright 2008 ACM /08/06...$5.00.

1. INTRODUCTION

Recent work has shown that many of the mechanisms for evaluating queries over annotated relations (e.g., incomplete and probabilistic databases, databases with multiplicities (bags), and those carrying provenance annotations) can be unified in a general framework based on commutative semirings (see the definition in §2). Intuitively, one of the semiring operations models alternative uses of data while the other models its joint (or dependent) use. In [16], semantics for positive relational algebra (i.e., unions of conjunctive queries) and positive Datalog were defined for relations decorated with annotations from a semiring. The same paper identified a canonical notation for provenance annotations using semiring polynomials (and formal power series) that captures, abstractly, computations in arbitrary semirings and therefore serves as a good representation for implementations [15].

This work has opened up a number of interesting avenues for investigation, but its restriction to the relational model is limiting. One of the main areas that motivates work on provenance is scientific data processing. In these applications, relational data sources are often combined with data extracted from hierarchical repositories of files. XML provides a natural model for tree-structured, heterogeneous sources, but current systems for managing XML data do not provide mechanisms for decorating XML with provenance annotations or for propagating annotated data through queries. A major goal of this work is to extend the framework for semiring-annotated relations described in [16] to handle annotated XML data.

Besides provenance, our work is also motivated by applications to incomplete and probabilistic XML data. Incomplete XML has not received much attention so far (see §8), but significant work has been done on probabilistic XML. For example, in [27], the uncertainty associated with data obtained by probing the hidden web (i.e., data hidden behind query forms and web services) is represented using XML trees whose nodes are annotated with boolean expressions composed of independent Bernoulli event variables.

Starting from these motivations, we develop an extension of the semiring annotation framework to XML and its premier query language, XQuery [11]. Because dealing with lists and ordered XML does not seem to be related to the way we use semirings (see §8), we focus on an unordered variant of XML. Previous work [16] provided strong evidence that the idea of using semirings to represent annotations is robust.
In this work, we describe two new results that add to this body of evidence:

- We define the semantics for a large fragment of first-order, positive XQuery (practically all of the features that do not depend on order) on semiring-annotated XML in two different ways, and show that these agree. The first approach goes by translation to an extension of the nested relational calculus [8] (NRC) (see footnote 1), while the second uses an encoding that shreds XML data into a child relation between node identifiers, and a corresponding translation of XPath into Datalog.
- We prove a general theorem showing that the semantics of queries commutes with the application of semiring homomorphisms.

Footnote 1: Since NRC is used by itself in various contexts [5, 17], this semantics is of interest even without the connection to XML.

By instantiating our semantics using annotations formulated as polynomials over a fixed set of variables with coefficients in N, we obtain our main contribution: a provenance framework for unordered XML data and a large class of XQuery views. We believe that this framework has practical potential: it captures an intuitive notion of provenance useful for scientific applications [15], and the size of the provenance polynomials is bounded by O(|D|^|q|), where D is the XML database and q is the XQuery program that defines the view.

Additionally, we illustrate two important applications of annotated XML: a security application that shows how to transfer confidentiality policies from a database to a view by organizing the clearance levels as a commutative semiring, and general strong representation systems for incomplete and probabilistic annotated databases that use the provenance polynomials themselves as annotations. The correctness of these systems follows from the commutation with homomorphisms theorem.

In outline, the paper is organized as follows. §2 reviews the notion of commutative semiring annotations. §3 introduces the unordered XML data model (UXML) and the corresponding fragment of XQuery (UXQuery), and describes our extension of these formalisms with semiring annotations. We defer a formal discussion of the semantics of UXQuery to §6, but illustrate its behavior on several examples. We describe applications to security and to incomplete and probabilistic data in §4 and §5. The main technical results are collected in §6.
There we review NRC, describe its extension to trees (§6.1), define its semantics (§6.2), give the compilation of UXQuery into this language (§6.3), and state the commutation with homomorphism theorems (§6.4). §7 presents an alternative definition for a fragment of UXQuery, via an encoding of UXML into relations and a translation of XPath into Datalog. §8 describes related work; we conclude with a brief discussion of ongoing and future work in §9. The long version of this abstract contains the complete definitions of each of these systems and is available as a technical report [13].

2. SEMIRING ANNOTATIONS

A commutative semiring (K, +, ·, 0, 1) is an algebraic structure consisting of a set K, operations + and ·, and distinguished elements 0, 1 ∈ K such that:

1. (K, +, 0) and (K, ·, 1) are commutative monoids;
2. k_1 · (k_2 + k_3) = k_1 · k_2 + k_1 · k_3, and 0 · k = 0.

As shown in [16], commutative semirings and relational data fit together naturally: when each tuple in a relation is tagged with an element of K, the semantics of standard query languages can be generalized to propagate the annotations in a way that captures bag semantics, probabilistic and incomplete relations, and standard notions of provenance. An (imperfect) intuition for the meaning of these annotations is as follows: 0 means that the tuple is not present or available; k_1 + k_2 means that the tuple can be produced from the data described by k_1 or that described by k_2; and the annotation k_1 · k_2 means that it requires both the data described by k_1 and that described by k_2. The annotation 1 means that exactly one copy of the tuple is available without restrictions. In the relational setting, it was shown that the axioms of commutative semirings are forced by standard equivalences on the (positive) relational algebra [16]. In this work, we show that commutative semirings also suffice for a variety of annotated nested data and their associated query languages.
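The axioms above can be spot-checked mechanically. The following is a minimal sketch, not part of the paper: the `Semiring` record, the instances `BOOL` and `NAT`, and the helper `satisfies_axioms` are hypothetical names introduced here, and the check only tests the laws on a finite sample of elements rather than proving them.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class Semiring:
    """A candidate commutative semiring (K, +, ., 0, 1); illustrative only."""
    zero: Any
    one: Any
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]


# Two of the semirings used in the text: Booleans (sets) and naturals (bags).
BOOL = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
NAT = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)


def satisfies_axioms(K: Semiring, samples: Iterable[Any]) -> bool:
    """Check the commutative semiring laws on every triple of sample elements."""
    elems = list(samples)
    for a, b, c in product(elems, repeat=3):
        laws = (
            K.add(a, b) == K.add(b, a),                           # + commutative
            K.mul(a, b) == K.mul(b, a),                           # . commutative
            K.add(K.add(a, b), c) == K.add(a, K.add(b, c)),       # + associative
            K.mul(K.mul(a, b), c) == K.mul(a, K.mul(b, c)),       # . associative
            K.add(a, K.zero) == a,                                # 0 neutral for +
            K.mul(a, K.one) == a,                                 # 1 neutral for .
            K.mul(a, K.add(b, c)) == K.add(K.mul(a, b), K.mul(a, c)),  # distributivity
            K.mul(K.zero, a) == K.zero,                           # 0 annihilates
        )
        if not all(laws):
            return False
    return True
```

For instance, the structure (N, max, +, 0, 0) satisfies distributivity but fails the law 0 · k = 0, so the check rejects it.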
We develop our theory for arbitrary commutative semirings, but use specific semirings in various applications:

- (B, ∨, ∧, false, true): set-based data;
- (N, +, ·, 0, 1): bag-based data;
- Positive boolean expressions: incomplete/probabilistic data (see [16] and §5);
- Confidentiality levels: see §4;
- Lineage and why-provenance (it turns out that these are different and correspond to different semirings, see [4]);
- (N[X], +, ·, 0, 1): the "universal" semiring of multivariate polynomials with coefficients in N and indeterminates in X.

The polynomials in N[X] provide a very general and informative notion of provenance (see footnote 2) and, in fact, capture the generality of all commutative semiring calculations: any function X → K can be uniquely extended to a semiring homomorphism N[X] → K. This fact is relevant to querying since (as in [16]) by Theorem 1 and Corollary 1 below, our semantics for query answering commutes with applying homomorphisms to annotated data. This yields the principal result of our framework: a comprehensive notion of provenance for unordered XML and a corresponding fragment of XQuery.

3. ANNOTATED AND UNORDERED XML

We fix a commutative semiring K and consider XML data modified so that instead of lists of trees (sequences of elements) there are sets of trees. Moreover, each tree belonging to such a set is decorated with an annotation k ∈ K. Since bags of elements can be obtained by interpreting the annotations as multiplicities (by picking K to be (N, +, ·, 0, 1)), the only difference compared to standard XML is the absence of ordering between siblings (see footnote 3). We call such data K-annotated unordered XML, or simply K-UXML. Given a domain L of labels, the usual mutually recursive definition of XML data naturally generalizes to K-UXML (see footnote 4):

- A value is either a label in L, a tree, or a K-set of trees;
- A tree consists of a label together with a finite (possibly empty) K-set of trees as its "children";
- A finite K-set of trees is a function from trees to K such that all but finitely many trees map to 0.

In examples, we illustrate K-UXML data by adding annotations as a superscript on the label at the root of the (sub)tree. By convention, omitted annotations correspond to the "neutral" element 1 ∈ K (see footnote 5). Note that a tree gets an annotation only as a member of a K-set. To annotate a single tree, we place it in a singleton K-set.
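One way to make the data model concrete is to represent a finite K-set as a dictionary from trees to nonzero annotations. The sketch below is only an illustration under stated assumptions: it fixes K = (N, +, ·, 0, 1), and the names `make_kset`, `make_tree`, and the labels `a1`, `a2`, `b` are hypothetical. Because a K-set is a function from trees to K, inserting the same tree twice must add the two annotations.

```python
from typing import Dict, Iterable, Tuple

# A tree is (label, frozen K-set of children); freezing keeps trees hashable.
Tree = Tuple[str, frozenset]
KSet = Dict[Tree, int]  # annotations are ints because K = N in this sketch


def make_kset(pairs: Iterable[Tuple[Tree, int]]) -> KSet:
    """Build a finite K-set: equal trees are merged by adding their annotations,
    and trees annotated with 0 are dropped (0 means "not present")."""
    merged: KSet = {}
    for tree, k in pairs:
        merged[tree] = merged.get(tree, 0) + k
    return {t: k for t, k in merged.items() if k != 0}


def make_tree(label: str, children: KSet) -> Tree:
    """A tree pairs a label with a (possibly empty) K-set of child trees."""
    return (label, frozenset(children.items()))


# Leaves are trees with an empty K-set of children.
leaf_a1 = make_tree("a1", {})
leaf_a2 = make_tree("a2", {})

# Same child tree inserted twice: the annotations 2 and 3 are summed.
same_label = make_tree("b", make_kset([(leaf_a1, 2), (leaf_a1, 3)]))
# Two different child trees: each keeps its own annotation.
different = make_tree("b", make_kset([(leaf_a1, 2), (leaf_a2, 3)]))
```

A general implementation would take the semiring's addition and zero as parameters instead of hard-coding integer arithmetic.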
When the semiring of annotations is (B, ∨, ∧, false, true) we have essentially unannotated unordered XML; we write UXML instead of B-UXML. In Figure 1, two K-UXML data values are displayed as trees. The source value can be written in document style as

    <a^z> <b^x_1> d^y_1 </> <c^x_2> d^y_2 e^y_3 </> </>

where we have abbreviated leaves <l></> as l.

Footnote 2: These polynomials can be used, for example, to track provenance in systems for scientific data sharing, see [15].
Footnote 3: For simplicity, we also omit attributes and model atomic values as the labels on trees having no children.
Footnote 4: In the XQuery data model, sets of labels are also values; it is straightforward to extend our formal treatment to include this.
Footnote 5: Items annotated with 0 are allowed by the definition but are useless because our semantics interprets 0 as "not present/available".

Figure 1: Simple for Example.
    Source: the singleton K-set containing a^z, whose children are b^x_1 (with child d^y_1) and c^x_2 (with children d^y_2 and e^y_3).
    Answer: the tree p with children d^(z·x_1·y_1 + z·x_2·y_2) and e^(z·x_2·y_3).

Figure 2: K-UXQuery Syntax.
    l ∈ L, k ∈ K
    p  ::= l | $x | () | (p) | p,p | for $x in p return p | let $x := p return p
         | if (p=p) then p else p | element p {p} | name(p) | annot k p | p/s
    s  ::= ax::nt
    ax ::= self | child | descendant
    nt ::= l | *

We propose a query language for K-UXML called K-UXQuery. Its syntax, listed in Figure 2, corresponds to a core fragment of XQuery [11] with one exception: the new construct annot k p allows queries to modify the annotations on sets. With annot k p any K-UXML value can be built with the K-UXQuery constructs. We use the following types for K-UXML and K-UXQuery:

    t ::= label | tree | {tree}

where label denotes L, tree denotes the set of all trees, and {tree} denotes the set of all finite K-sets of trees. The typing rules for selected K-UXQuery operators are given in Figure 3. At the end of this section we discuss this syntax in more detail, and in §6.3 we present a formal semantics that uses the operations of the semiring to combine annotations. In the rest of this section, however, we illustrate the semantics informally on some simple examples to introduce the basic ideas. We start with very simple queries demonstrating how the individual operators work, and build up to a larger example corresponding to a translation of a relational algebra query.

As a first example, let p_i = element a_i {()} for i ∈ {1, 2}.
That is, each p_i constructs a tree with no children. The query (p_1) produces the singleton K-set in which p_1 is annotated with 1 ∈ K, and the query annot k_1 (p_1) produces the singleton K-set in which p_1 is annotated with k_1 · 1 = k_1. We can also construct a union of K-sets: let q be annot k_1 (p_1), annot k_2 (p_2). The result computed by q depends on whether a_1 and a_2 are the same label or different labels. If a_1 = a_2 = a, then p_1 and p_2 are the same tree, and so the query element b {q} produces the tree b with the single child a^(k_1 + k_2). If a_1 ≠ a_2, then the same query produces the tree b with the two children a_1^k_1 and a_2^k_2.

Next, let us examine a query that uses iteration:

    p = element p { for $t in $S return
                      for $x in ($t)/* return ($x)/* }

Figure 3: Selected K-UXQuery Typing Rules (premises ⟹ conclusion).
    Γ ⊢ p_1 : {tree},  Γ ⊢ p_2 : {tree}  ⟹  Γ ⊢ p_1,p_2 : {tree}
    Γ ⊢ p_1 : {tree},  Γ, x : tree ⊢ p_2 : {tree}  ⟹  Γ ⊢ for $x in p_1 return p_2 : {tree}
    Γ ⊢ p_1 : label,  Γ ⊢ p_2 : label,  Γ ⊢ p_3 : t,  Γ ⊢ p_4 : t  ⟹  Γ ⊢ if (p_1=p_2) then p_3 else p_4 : t
    Γ ⊢ p_1 : label,  Γ ⊢ p_2 : {tree}  ⟹  Γ ⊢ element p_1 {p_2} : tree
    Γ ⊢ p : {tree}  ⟹  Γ ⊢ p/ax::nt : {tree}
    Γ ⊢ p_1 : tree  ⟹  Γ ⊢ name(p_1) : label
    k ∈ K,  Γ ⊢ p : {tree}  ⟹  Γ ⊢ annot k p : {tree}

Figure 4: XPath Example. (Source and answer trees; in the answer, the tree r has a child annotated with q_1, where q_1 = x_1 · y_3 + y_1 · y_2.)

If $S is the (source) set on the left side of Figure 1, then the answer produced by p is the tree on the right in the same figure (see footnote 6). Operationally, the query works as follows. First, the outer for-clause iterates over the set given by $S. As $S is a singleton in our example, $t is bound to the tree whose root is labeled a and whose annotation in $S is z. Next, the inner for-clause iterates over the set of trees given by ($t)/*, namely the K-set containing b^x_1 (with child d^y_1) and c^x_2 (with children d^y_2 and e^y_3). It binds $x to each of these trees, evaluates the return-clause in this extended context, and multiplies the resulting set by the annotation on $x. For example, when $x is bound to the b child, the return-clause produces the singleton set (d^y_1). Multiplying this set by the annotation x_1 yields (d^(x_1·y_1)).
After combining all the sets returned by iterations of this inner for-clause, we obtain the set (d^(x_1·y_1 + x_2·y_2), e^(x_2·y_3)). The final answer for p is obtained by multiplying this set by z. Note that the annotation on each child in the answer is the sum, over all paths that lead to that child in $t, of the product of the annotations from the root of $t to that child, thus recording how it arises from subtrees of $S.

Next we illustrate the semantics of XPath descendant navigation (shorthand //). Consider the query

    r = element r { $T//c }

which picks out the set of subtrees of elements of $T whose label is c. A sample source and the corresponding answer computed by r are shown in Figure 4. In §6.3 we define the semantics of the descendant operator using structural recursion and iteration. It has the property that the annotation for each subtree in the answer is the sum of the products of annotations for each path from the root to an occurrence of that subtree in the source, like the answer shown here.

Footnote 6: Actually this query is equivalent to the shorter "grandchildren" XPath query $S/*/*; we use the version with for-clauses to illustrate the semantics of iteration.

Figure 5: Relational (encoded) example.

    Source and answer as K-relations:

    R:  A  B  C  | annotation
        a  b  c  | x_1
        d  b  e  | x_2
        f  g  e  | x_3

    S:  B  C  | annotation
        b  c  | x_4
        g  c  | x_5

    Q:  A  C  | annotation
        a  c  | x_1^2 + x_1·x_4
        a  e  | x_1·x_2
        d  c  | x_1·x_2 + x_2·x_4
        d  e  | x_2^2
        f  c  | x_3·x_5
        f  e  | x_3^2

    Query:

    let $r := $d/R/*,
        $rab := for $t in $r return <t> { $t/a,$t/b } </>,
        $rbc := for $t in $r return <t> { $t/b,$t/c } </>,
        $s := $d/S/*
    return <Q> { for $x in $rab, $y in ($rbc,$s)
                 where $x/b=$y/b
                 return <t> { $x/a,$y/c } </> } </>

    Source as UXML and answer as UXML: trees that encode the relations above, with each tuple node t carrying the annotation of its tuple.

Now we turn to a larger example, which demonstrates how K-UXQuery behaves on an encoding of a database of relations whose tuples are annotated with elements of K (called K-relations in [16]). As a sanity check, we verify that our semantics for K-UXQuery on this data agrees with the semantics previously given for the positive relational algebra [16].
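The relational side of this sanity check can be reproduced with a small amount of code. The sketch below is an illustration under stated assumptions, not the paper's implementation: polynomials in N[X] are represented as `Counter` objects mapping monomials (sorted tuples of variable names) to coefficients, the helper names are hypothetical, and the tuple values are those reconstructed from Figure 5. It evaluates the relational algebra query of Figure 5, Q = π_AC(π_AB(R) ⋈ (π_BC(R) ∪ S)), where union and projection add annotations and join multiplies them.

```python
from collections import Counter

# Polynomials in N[X]: Counter from monomial (sorted tuple of variables) to coefficient.
def var(name):
    return Counter({(name,): 1})

def p_add(p, q):
    result = Counter(p)
    result.update(q)  # Counter.update adds coefficients of equal monomials
    return result

def p_mul(p, q):
    result = Counter()
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            result[tuple(sorted(m1 + m2))] += c1 * c2
    return result

# K-relations: dictionaries from tuples to annotations (data as read from Figure 5).
R = {("a", "b", "c"): var("x1"), ("d", "b", "e"): var("x2"), ("f", "g", "e"): var("x3")}
S = {("b", "c"): var("x4"), ("g", "c"): var("x5")}

def project(rel, cols):
    """Projection: tuples that collapse together have their annotations added."""
    out = {}
    for tup, k in rel.items():
        key = tuple(tup[i] for i in cols)
        out[key] = p_add(out.get(key, Counter()), k)
    return out

def union(r1, r2):
    """Union: annotations of tuples present in both relations are added."""
    out = dict(r1)
    for tup, k in r2.items():
        out[tup] = p_add(out.get(tup, Counter()), k)
    return out

def join_on_b(left, right):
    """Join left(A, B) with right(B, C) on B: annotations are multiplied."""
    return {(a, b1, c): p_mul(k1, k2)
            for (a, b1), k1 in left.items()
            for (b2, c), k2 in right.items() if b1 == b2}

def evaluate(p, valuation):
    """Apply the homomorphism N[X] -> N induced by a valuation of the variables."""
    total = 0
    for monomial, coeff in p.items():
        term = coeff
        for v in monomial:
            term *= valuation[v]
        total += term
    return total

Q = project(join_on_b(project(R, (0, 1)), union(project(R, (1, 2)), S)), (0, 2))
```

Evaluating an annotation of `Q` with every variable set to 1 gives the multiplicity the tuple would have under bag semantics, which is one instance of the commutation with homomorphisms discussed in the text.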
Consider the following relational algebra query

    Q = π_AC(π_AB(R) ⋈ (π_BC(R) ∪ S))

and suppose that we evaluate it over the K-relations R(A, B, C) and S(B, C) shown at the top of Figure 5. The result, cf. [16], is the K-relation Q(A, C), also shown at the top of Figure 5. For example, the annotation on (d, c) in Q is a sum of products, x_1 · x_2 + x_2 · x_4, which records that the tuple can be obtained by joining two R-tuples or, alternatively, by joining an R-tuple and an S-tuple. The rest of Figure 5 shows the K-UXML tree that is obtained by encoding the relations R and S in an obvious way, the corresponding translation of the view definition into K-UXQuery, and the K-UXML view that is computed using K-UXQuery. Observe that the result is the encoding of the K-relation Q. The next proposition states that this equivalence holds in general. (Throughout the paper we abuse notation and conflate the syntax and semantics of expressions, i.e., we write e instead of [[e]].)

Figure 6: Extended Annotations Example. (Source and answer trees as in Figure 5, with additional annotations w_1 on the database, y_i on the attribute nodes, and z_i on the field values; the tuple nodes of the answer are annotated with q_1, ..., q_8, where)
    q_1 = w_1 · x_1 · x_4 · y_2 · y_5 · z_1 · z_6
    q_2 = w_1^2 · x_1^2 · y_2^2 · z_1^2
    q_3 = w_1^2 · x_1 · x_2 · y_2^2 · z_1 · z_2
    q_4 = w_1 · x_2 · x_4 · y_2 · y_5 · z_2 · z_6
    q_5 = w_1^2 · x_1 · x_2 · y_2^2 · z_1 · z_2
    q_6 = w_1^2 · x_2^2 · y_2^2 · z_2^2
    q_7 = w_1 · x_3 · x_5 · y_2 · y_5 · z_4 · z_7
    q_8 = w_1^2 · x_3^2 · y_2^2 · z_4^2

PROPOSITION 1. Let Q be a query in positive relational algebra, and I a K-relational database instance. Let v be the K-UXML encoding of I, and p be the translation of Q into K-UXQuery. Then p(v), computed according to K-UXQuery, encodes Q(I), the K-relation computed according to the semantics in [16].

In a K-relation, annotations only appear on tuples. In our model for annotated UXML data, however, every internal node carries an annotation (recall that, according to our convention, every node in Figure 5 depicted with no annotation carries the "neutral" element 1 ∈ K).
Therefore, we have more flexibility in how we annotate source values: besides tuples, we can place annotations on the values in individual fields, on attributes, on the relations themselves, and even on the whole database! It is interesting to see how, even for a query that is essentially relational, these extra annotations participate in the calculations. We have worked this out in the final example of this section, see Figure 6. The query is the same as in Figure 5 but the source data has additional annotations. Note how the expressions annotating the tuple nodes in the answer involve many non-tuple annotations from the source.

So far we have assumed that the annotations belong to an arbitrary commutative semiring K, and we looked at the expressions that equate q_1, ..., q_8 in Figure 6 as calculations in K. However, if we work with the semiring of polynomials K = (N[X], +, ·, 0, 1), where we think of the source annotations as indeterminates ("provenance tokens") and take X := {w_1, x_1, ..., x_5, y_1, ..., y_6, z_1, ..., z_7}, then the expressions that equate q_1, ..., q_8 are the provenance polynomials that annotate the tuple nodes in the answer. This kind of provenance shows, for example, that some of the tuples in the answer use source data annotated with z_1 or y_5, although these do not appear explicitly in the annotations of the answer attributes or values in the tuples.

The annotations in a particular semiring K can then be computed by evaluating these polynomials in K. Corollary 1 (commutation with homomorphisms) guarantees that the result will be the same as that obtained via the semantics on K-UXML values. Note also that we can obtain the answer shown in Figure 5 simply by setting all the indeterminates except for x_1, ..., x_5 to 1 and then simplifying using the semiring laws. When we set these indeterminates to 1, some subtrees which were distinguished by annotations now become identified (q_1 and q_2, q_4 and q_5); this explains the sums in the annotations of the answer in Figure 5. The semantics in §6 allows us to prove the following upper bound:

PROPOSITION 2.
If v is a UXML value annotated with indeterminates from a set X and p is a UXQuery, then computing p(v) according to the N[X]-UXQuery semantics produces an N[X]-UXML value such that the size of any of the provenance polynomials that annotate p(v) is O(|v|^|p|).

K-UXQuery vs. XQuery. Although UXQuery only contains core operators, more complicated syntactic features such as the where-clauses that we used in the examples above can be normalized into core queries using standard translations [11]. For example, the where-clause where $x/b=$y/b from Figure 5 normalizes to:

    for $a in $x/b/* return
      for $b in $y/b/* return
        if (name($a)=name($b)) then ... else ()

Our language includes only the downward XPath axes, since the other axes can be compiled into this fragment [24]. To simplify our formal system, we also do not identify a value with the singleton set containing it. This is inessential, but it simplifies the compilation in §6.3. In examples we often elide the extra set constructor when it is clear from context; e.g., we wrote $x/a above, not ($x)/a. Unlike these minor differences, we made two essential restrictions in the design of K-UXQuery. The first has to do with order: we omit order