After manual verification, an AI wrote this article.
Introduction

When editing TEI (Text Encoding Initiative) XML, validating the structure of elements and attributes is often not enough; more complex, project-specific rules may also need to be checked. This article explains how to combine RELAX NG (RNG) and Schematron to validate both structure and content, using problems encountered in an actual project as examples.

The Problem to Solve

When editing classical Japanese literary texts in TEI XML, the following requirements arose:

- Dynamic validation of ID references: validate that the IDs referenced by corresp attributes actually exist on witness elements within the document
- Completion in Oxygen XML Editor: automatically display ID candidates during editing
- Multiple ID references: allow several IDs to be specified, separated by spaces
- Restricting references to specific elements: only allow references to witness element IDs, and raise an error if a person element ID is included

Why RNG + Schematron?

RELAX NG Strengths

- Element and attribute structure definition
- Data type specification
- Basic content model definition

Schematron Strengths

- XPath-based complex validation rules
- Cross-reference checks within documents
- Custom error messages

Combining the two enables strict validation of both structure and content.

Implementation Examples

1. Basic RNG Schema Structure

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
         xmlns:sch="http://purl.oclc.org/dsdl/schematron"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
         ns="http://www.tei-c.org/ns/1.0">
  <!-- Schematron namespace definition -->
  <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <!-- Embedded Schematron rules go here -->
  <start>
    <ref name="TEI"/>
  </start>
  <!-- Structural definitions follow -->
</grammar>
```

2. ID Definition and Use of anyURI Type

Use the anyURI type to achieve auto-completion in Oxygen XML Editor:

```xml
<!-- Witness element definition: xml:id is declared with type ID -->
<define name="witness">
  <element name="witness">
    <attribute name="xml:id">
      <data type="ID"/>
    </attribute>
    <text/>
  </element>
</define>

<!-- corresp attribute definition (the pattern name is illustrative) -->
<define name="att.corresp">
  <attribute name="corresp">
    <list>
      <oneOrMore>
        <!-- anyURI accepts the #id form; Oxygen displays xml:id values with # as candidates -->
        <data type="anyURI"/>
      </oneOrMore>
    </list>
  </attribute>
</define>
```

Key points:

- data type="ID" guarantees uniqueness
- data type="anyURI" allows internal references with #
- The list element allows multiple space-separated values

3. Advanced Validation with Schematron

```xml
<sch:pattern>
  <sch:title>corresp reference validation</sch:title>
  <sch:rule context="tei:*[@corresp]">
    <!-- IDs that may be referenced, and IDs that must not be -->
    <sch:let name="witnessIDs" value="//tei:witness/@xml:id"/>
    <sch:let name="personIDs" value="//tei:person/@xml:id"/>
    <!-- Split the space-separated value into individual tokens -->
    <sch:let name="tokens" value="tokenize(normalize-space(@corresp), '\s+')"/>
    <!-- Collect the tokens that do not point at a witness ID -->
    <sch:let name="invalidTokens"
             value="for $token in $tokens
                    return if (starts-with($token, '#') and substring($token, 2) = $witnessIDs)
                           then () else $token"/>

    <!-- Validate that every referenced ID exists on a witness element -->
    <sch:assert test="empty($invalidTokens)" role="error">
      corresp must reference witness IDs in the form #id. Invalid values:
      <sch:value-of select="string-join($invalidTokens, ', ')"/>
    </sch:assert>

    <!-- Error if a person ID is included -->
    <sch:report test="some $token in $tokens satisfies substring($token, 2) = $personIDs"
                role="error">
      corresp must not reference a person ID. Only witness IDs are allowed.
    </sch:report>
  </sch:rule>
</sch:pattern>
```

Key points:

- Define variables with sch:let and retrieve their values dynamically with XPath
- Parse multiple ID references with tokenize()
- sch:assert raises an error when its condition is not met
- sch:report raises an error when its condition is met
- role="error" specifies the error level (warning and info are also available)

4. Actual Usage Example

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The schema file name and all IDs below are illustrative -->
<?xml-model href="schema.rng" type="application/xml"
            schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="schema.rng" type="application/xml"
            schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <listWit>
      <witness xml:id="A">Witness A</witness>
      <witness xml:id="B">Witness B</witness>
    </listWit>
    <listPerson>
      <person xml:id="P1">Person 1</person>
    </listPerson>
  </teiHeader>
  <text>
    <body>
      <p>
        <!-- Correct: every referenced ID exists on a witness element -->
        <app>
          <lem corresp="#A">reading</lem>
          <rdg corresp="#A #B">reading</rdg>
        </app>
      </p>
      <p>
        <!-- Error: #X does not exist, and #P1 is a person ID -->
        <app>
          <lem corresp="#X">reading</lem>
          <rdg corresp="#A #P1">reading</rdg>
        </app>
      </p>
    </body>
  </text>
</TEI>
```
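
To make the logic of the reference check easier to follow outside an XML editor, here is a minimal Python sketch of the same idea using only the standard library. It is not the project's implementation: the function name `check_corresp`, the sample document, and the IDs `w1`, `w2`, `p1` are all hypothetical, and the sample is deliberately not a complete TEI file. It splits each corresp value on whitespace, strips the leading #, and compares the result against the witness and person IDs found in the document.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # expanded name of xml:id

# Hypothetical, heavily abbreviated sample (not a valid TEI document).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <witness xml:id="w1"/>
  <witness xml:id="w2"/>
  <person xml:id="p1"/>
  <rdg corresp="#w1 #w2"/>
  <rdg corresp="#w1 #p1 #w9"/>
</TEI>"""


def ids_of(root, local_name):
    """Collect the xml:id values of all TEI elements with the given local name."""
    found = set()
    for el in root.iter("{%s}%s" % (TEI_NS, local_name)):
        value = el.get(XML_ID)
        if value:
            found.add(value)
    return found


def check_corresp(xml_text):
    """Return (token, problem) pairs for every bad corresp reference."""
    root = ET.fromstring(xml_text)
    witness_ids = ids_of(root, "witness")
    person_ids = ids_of(root, "person")
    problems = []
    for el in root.iter():
        corresp = el.get("corresp")
        if corresp is None:
            continue
        # str.split() with no argument splits on any run of whitespace,
        # which plays the role of tokenize(..., '\s+') in the rule.
        for token in corresp.split():
            if not token.startswith("#"):
                problems.append((token, "missing leading #"))
                continue
            target = token[1:]  # same as substring($token, 2)
            if target in person_ids:
                problems.append((token, "refers to a person, not a witness"))
            elif target not in witness_ids:
                problems.append((token, "no witness with this ID"))
    return problems


if __name__ == "__main__":
    for token, problem in check_corresp(SAMPLE):
        print(token, "->", problem)
```

With the sample above, the first rdg passes and the second produces two findings, one for `#p1` and one for `#w9`. In the editor this work is done by the Schematron rule itself; the sketch only illustrates what that rule checks.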

Implementation Notes

1. XPath 2.0 Syntax

Pay attention to the for expression syntax in XPath expressions within Schematron:

```xml
<!-- Correct: for ends with a return clause, and if has both then and else -->
<sch:let name="invalidTokens"
         value="for $token in $tokens
                return if (substring($token, 2) = $validIds) then () else $token"/>
```

In XPath 2.0 a for expression must have the form "for $var in sequence return expression", and an if expression must have both a then and an else branch. Omitting either is a syntax error.

2. IDREF vs anyURI

- IDREF type: cannot include #, which limits completion in Oxygen
- anyURI type: allows values with #, and Oxygen automatically provides ID completion

3. Schematron's role Attribute

- role="error": red error marker
- role="warning": yellow warning marker
- role="info": blue information marker

Application Examples

Complex Cross-Reference Validation

```xml
<sch:pattern>
  <!-- app element constraints -->
  <sch:rule context="tei:app">
    <sch:assert test="count(tei:lem) = 1">
      app element must contain exactly one lem element
    </sch:assert>
  </sch:rule>
  <!-- rdg element constraints -->
  <sch:rule context="tei:rdg">
    <sch:let name="lemCorresp" value="../tei:lem/@corresp"/>
    <sch:assert test="not(@corresp = $lemCorresp)">
      rdg element's corresp must differ from the lem element's
    </sch:assert>
  </sch:rule>
</sch:pattern>
```

Conditional Required Attributes

```xml
<sch:pattern>
  <!-- When a when attribute is present, check its format -->
  <sch:rule context="tei:date[@when]">
    <sch:assert test="matches(@when, '^\d{4}-\d{2}-\d{2}$')">
      when attribute must be in ISO date format (YYYY-MM-DD)
    </sch:assert>
  </sch:rule>
</sch:pattern>
```

Summary

Combining RELAX NG and Schematron provides:

- Separation of structural and content validation: a design that leverages each tool's strengths
- Dynamic validation rules: flexible validation based on document content
- Editor support: advanced editing assistance in Oxygen XML Editor and similar tools
- Clear error messages: custom messages in any language

This combination is especially powerful when editing documents with complex structures such as TEI XML.

References

The complete schema code introduced in this article is from an actual project. I hope it serves as a reference for those facing similar challenges.