This article explains how to migrate data from Amazon Elasticsearch Service to another OpenSearch cluster. It introduces a simple and reliable migration method using the Scroll API and Bulk API.
Background

The need to migrate data between Elasticsearch/OpenSearch clusters can arise from a cloud service migration or cost optimization. In this case, the migration was performed between the following environments.
- Source: Amazon Elasticsearch Service (AWS)
- Destination: Self-hosted OpenSearch

Migration Flow

1. Check indices on source and destination
2. Retrieve and adjust mapping information
3. Create indices on the destination
4. Migrate data with Scroll API + Bulk API
5. Verify migration results

Preparation: Checking Indices

First, check the index lists on both the source and destination.
    # Source index list
    curl -u "user:password" "https://source-cluster/_cat/indices?v&s=index"

    # Destination index list
    curl -u "user:password" "https://dest-cluster/_cat/indices?v&s=index"
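When the index lists are long, the comparison can be scripted. The sketch below assumes the _cat/indices output was fetched with format=json, which returns a list of objects that each carry an "index" field. The function name and the sample rows are illustrative only.

```python
def missing_indices(source_rows, dest_rows):
    """Return source index names that do not exist on the destination.

    Both arguments are lists of dicts as returned by
    GET /_cat/indices?format=json (only the "index" key is used).
    """
    dest_names = {row["index"] for row in dest_rows}
    return sorted(
        row["index"]
        for row in source_rows
        if row["index"] not in dest_names
        and not row["index"].startswith(".")  # skip system indices such as .kibana
    )


# Illustrative sample data, not real cluster output
source_rows = [{"index": "items1"}, {"index": "items2"}, {"index": ".kibana_1"}]
dest_rows = [{"index": "items1"}]
print(missing_indices(source_rows, dest_rows))
```

Skipping names that start with a dot is an assumption made here to leave system indices out of the migration; drop that condition if they should be included.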
Retrieve the mapping information from the source and save it as index_name_mappings.json, the file name that the index creation script in Step 3 opens.

    curl -s -u "user:password" \
      "https://source-cluster/index_name/_mapping" \
      > index_name_mappings.json
Step 2: Adjust Mappings

Depending on the destination environment, custom analyzers may not be available. For example, if the kuromoji plugin for Japanese morphological analysis is not installed on the destination, the analyzer settings must be removed from the mappings before the index is created.
    def remove_analyzer(obj):
        """Recursively remove analyzer settings from mappings"""
        if isinstance(obj, dict):
            if 'analyzer' in obj:
                del obj['analyzer']
            for key, value in obj.items():
                remove_analyzer(value)
        elif isinstance(obj, list):
            for item in obj:
                remove_analyzer(item)
        return obj
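As a variation on the function above, the following sketch substitutes a built-in analyzer instead of deleting the setting, so text fields keep an explicit analyzer on the destination. The function name, the choice of keys to rewrite (analyzer and search_analyzer), and the sample mapping are assumptions for illustration, not part of the original procedure.

```python
import copy

# Assumption: these are the mapping keys worth rewriting
ANALYZER_KEYS = ("analyzer", "search_analyzer")


def replace_analyzer(obj, fallback="standard"):
    """Recursively point analyzer settings at a built-in analyzer."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key in ANALYZER_KEYS and isinstance(value, str):
                obj[key] = fallback
            else:
                replace_analyzer(value, fallback)
    elif isinstance(obj, list):
        for item in obj:
            replace_analyzer(item, fallback)
    return obj


# Illustrative mapping fragment
mapping = {
    "properties": {
        "title": {"type": "text", "analyzer": "kuromoji"},
        "tags": {"type": "keyword"},
    }
}
cleaned = replace_analyzer(copy.deepcopy(mapping))
print(cleaned["properties"]["title"])
```

Working on a deep copy leaves the downloaded mapping untouched for later comparison. Checking that the value is a string also avoids rewriting a document field that merely happens to be named analyzer.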
Step 3: Create Indices on the Destination

Create the indices using the adjusted mappings.
    import json
    import requests
    from requests.auth import HTTPBasicAuth

    def create_index(source_index, dest_index, dest_url, dest_auth):
        # Load mapping file
        with open(f'{source_index}_mappings.json') as f:
            mappings_data = json.load(f)
        mappings = mappings_data[source_index]['mappings']
        # remove_analyzer() is the function defined in Step 2
        clean_mappings = remove_analyzer(mappings)
        index_body = {
            'settings': {
                'number_of_shards': 1,
                'number_of_replicas': 1
            },
            'mappings': clean_mappings
        }
        response = requests.put(
            f"{dest_url}/{dest_index}",
            json=index_body,
            auth=dest_auth
        )
        print(f"Create {dest_index}: Status {response.status_code}")
        return response.status_code in [200, 201]
Step 4: Data Migration Script

Use the Scroll API to retrieve data from the source and insert it into the destination using the Bulk API.
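Before the full script, it may help to see the request body that the Bulk API expects: newline-delimited JSON, with one action line followed by one source line per document. The helper below is a minimal sketch of that conversion from Scroll API hits; the function name and the sample hit are hypothetical.

```python
import json


def build_bulk_body(hits, dest_index):
    """Turn scroll hits into an NDJSON body for the Bulk API."""
    lines = []
    for hit in hits:
        # Action line: index the document under its original _id
        action = {"index": {"_index": dest_index, "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        # Source line: the document itself
        lines.append(json.dumps(hit["_source"]))
    # The Bulk API requires the body to end with a newline
    return ("\n".join(lines) + "\n") if lines else ""


hits = [{"_id": "1", "_source": {"title": "hello"}}]  # illustrative hit
print(build_bulk_body(hits, "cj-items1"), end="")
```

Reusing the source _id keeps the operation idempotent, so a batch that is sent twice overwrites the same documents rather than creating duplicates.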
    import json
    import requests
    from requests.auth import HTTPBasicAuth
    import time

    SOURCE_URL = 'https://source-cluster'
    SOURCE_AUTH = HTTPBasicAuth('user', 'password')
    DEST_URL = 'https://dest-cluster'
    DEST_AUTH = HTTPBasicAuth('user', 'password')

    def migrate_index(source_index, dest_index, batch_size=1000):
        print(f"=== Migrating {source_index} -> {dest_index} ===")

        # Initialize Scroll API
        scroll_url = f"{SOURCE_URL}/{source_index}/_search?scroll=5m"
        query = {
            "size": batch_size,
            "query": {"match_all": {}}
        }
        response = requests.post(scroll_url, json=query, auth=SOURCE_AUTH)
        if response.status_code != 200:
            print(f"Error: {response.text}")
            return False

        data = response.json()
        scroll_id = data['_scroll_id']
        hits = data['hits']['hits']
        # Elasticsearch 7+ / OpenSearch format; on 6.x and earlier, total is a plain integer
        total = data['hits']['total']['value']
        print(f"Total documents: {total}")

        migrated = 0
        errors = 0
        start_time = time.time()

        while hits:
            # Prepare Bulk request
            bulk_body = ""
            for hit in hits:
                action = {"index": {"_index": dest_index, "_id": hit['_id']}}
                bulk_body += json.dumps(action) + "\n"
                bulk_body += json.dumps(hit['_source']) + "\n"

            # Send Bulk request
            headers = {"Content-Type": "application/x-ndjson"}
            bulk_response = requests.post(
                f"{DEST_URL}/_bulk",
                data=bulk_body,
                headers=headers,
                auth=DEST_AUTH
            )

            if bulk_response.status_code == 200:
                result = bulk_response.json()
                if result.get('errors'):
                    for item in result['items']:
                        if 'error' in item.get('index', {}):
                            errors += 1
                migrated += len(hits)
            else:
                print(f"Bulk error: {bulk_response.text[:200]}")
                errors += len(hits)

            # Show progress
            elapsed = time.time() - start_time
            rate = migrated / elapsed if elapsed > 0 else 0
            eta = (total - migrated) / rate if rate > 0 else 0
            print(f"\rProgress: {migrated}/{total} ({migrated*100//total}%) "
                  f"- {rate:.0f} docs/sec - ETA: {eta:.0f}s", end='', flush=True)

            # Get next batch
            scroll_response = requests.post(
                f"{SOURCE_URL}/_search/scroll",
                json={"scroll": "5m", "scroll_id": scroll_id},
                auth=SOURCE_AUTH
            )
            if scroll_response.status_code != 200:
                break
            data = scroll_response.json()
            scroll_id = data['_scroll_id']
            hits = data['hits']['hits']

        # Release Scroll context
        requests.delete(
            f"{SOURCE_URL}/_search/scroll",
            json={"scroll_id": scroll_id},
            auth=SOURCE_AUTH
        )

        print(f"\n\nCompleted: {migrated} documents, {errors} errors")
        return True

Step 5: Execute Migration and Verify

Run the migration for each index (Python):

    # Execute migration
    migrate_index('items1', 'cj-items1')
    migrate_index('items2', 'cj-items2')
    migrate_index('images', 'cj-images')

Then check the result on the destination (shell):

    # Verify migration results
    curl -u "user:password" "https://dest-cluster/_cat/indices?v&s=index"
Notes and Best Practices

1. Handling Custom Analyzers

If plugins such as kuromoji are not available on the destination, the following options are available:
- Migrate without analyzers: Japanese search accuracy will decrease, but the data is preserved
- Install plugins: Pre-install the necessary plugins on the destination
- Use alternative analyzers: Substitute the standard analyzer or another built-in analyzer

2. Adjusting Batch Size

    # Small documents: use larger batch size
    migrate_index('small_docs', 'dest_small', batch_size=5000)

    # Large documents: use smaller batch size
    migrate_index('large_docs', 'dest_large', batch_size=500)
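One way to choose a value between those two extremes is to aim for a roughly constant bulk payload size. The sketch below is a rule of thumb, not a tuned recommendation. The 10 MB target, the lower bound of 100, and the upper bound of 10,000 (chosen to mirror the default index.max_result_window) are all assumptions.

```python
def suggest_batch_size(avg_doc_bytes, target_bytes=10 * 1024 * 1024,
                       min_size=100, max_size=10_000):
    """Suggest a batch size so that one bulk request stays near target_bytes."""
    if avg_doc_bytes <= 0:
        raise ValueError("avg_doc_bytes must be positive")
    size = target_bytes // avg_doc_bytes
    # Clamp to the assumed lower and upper bounds
    return max(min_size, min(max_size, size))


print(suggest_batch_size(2 * 1024))    # small documents, about 2 KB each -> 5120
print(suggest_batch_size(100 * 1024))  # large documents, about 100 KB each -> 102
```

The average document size can be estimated from the store.size and docs.count columns of the _cat/indices output, keeping in mind that the on-disk size differs from the JSON size sent over the wire.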
3. Parallel Execution

You can reduce overall migration time by migrating multiple indices in parallel.
    # Run in parallel in the background
    python migrate.py items1 cj-items1 &
    python migrate.py items2 cj-items2 &
    python migrate.py images cj-images &
    wait
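The shell commands above assume that migrate.py accepts the index names as command-line arguments. An alternative is to stay in a single Python process and use a thread pool, which suits this workload because the time is spent waiting on HTTP requests. In the sketch below, migrate_all and fake_migrate are hypothetical names; in practice the function from Step 4 would be passed in place of the stand-in.

```python
from concurrent.futures import ThreadPoolExecutor


def migrate_all(index_pairs, migrate_fn, max_workers=3):
    """Run migrate_fn(source, dest) for each pair in parallel and collect the results."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(migrate_fn, source, dest): source
            for source, dest in index_pairs
        }
        for future, source in futures.items():
            try:
                results[source] = future.result()
            except Exception as exc:  # keep going if one index fails
                print(f"{source} failed: {exc}")
                results[source] = False
    return results


def fake_migrate(source, dest):
    """Stand-in for the real migration function, used only for this demo."""
    return True


pairs = [("items1", "cj-items1"), ("items2", "cj-items2"), ("images", "cj-images")]
print(migrate_all(pairs, fake_migrate))
```

Progress lines printed by several workers will interleave on the terminal, so per-index log files or a quieter progress display may be preferable when running this way.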
4. Error Handling

It is recommended to implement a mechanism that checks Bulk API responses and records or retries documents that encountered errors.
    import logging

    # result is the parsed JSON response of a Bulk API request
    if result.get('errors'):
        for item in result['items']:
            if 'error' in item.get('index', {}):
                error_doc_id = item['index']['_id']
                error_reason = item['index']['error']['reason']
                # Record in error log
                logging.error(f"Failed: {error_doc_id} - {error_reason}")
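The retry half of that recommendation could look like the following sketch, which resends only the documents that failed and waits longer between attempts. Here send_bulk is a hypothetical callable that posts the given documents to the Bulk API and returns the parsed response. The attempt count and delays are arbitrary assumptions.

```python
import time


def failed_ids(result):
    """Collect the _id of every item that the Bulk API reported as failed."""
    ids = []
    for item in result.get("items", []):
        info = item.get("index", {})
        if "error" in info:
            ids.append(info["_id"])
    return ids


def send_with_retry(docs, send_bulk, max_attempts=3, base_delay=1.0):
    """Send docs (a dict of _id -> source), retrying only the failed ones.

    Returns the ids that still failed after the last attempt.
    """
    pending = dict(docs)
    for attempt in range(max_attempts):
        result = send_bulk(pending)
        if not result.get("errors"):
            return []
        failed = set(failed_ids(result))
        pending = {doc_id: src for doc_id, src in pending.items() if doc_id in failed}
        if not pending:
            return []
        if attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
    return sorted(pending)


calls = []


def fake_send(docs):
    """Demo stand-in: fails document "2" on the first call only."""
    calls.append(sorted(docs))
    if len(calls) == 1:
        return {"errors": True, "items": [
            {"index": {"_id": "1", "status": 201}},
            {"index": {"_id": "2", "status": 429, "error": {"reason": "rejected"}}},
        ]}
    return {"errors": False, "items": [{"index": {"_id": "2", "status": 201}}]}


remaining = send_with_retry({"1": {"a": 1}, "2": {"a": 2}}, fake_send, base_delay=0)
print(remaining, calls)
```

Errors such as mapping conflicts will not succeed on a retry, so a fuller implementation would probably inspect the error type and log those documents instead of resending them.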
5. Verifying Document Counts

After migration, compare the document counts between the source and destination to verify data consistency.
    # Source document count
    curl -s -u "user:password" "https://source/_count?q=*"

    # Destination document count
    curl -s -u "user:password" "https://dest/_count?q=*"
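With several indices, the comparison itself is easy to automate once the counts have been collected, for example from the "count" field of each _count response. The function below is a small sketch; its name and the sample numbers are illustrative.

```python
def compare_counts(index_pairs, source_counts, dest_counts):
    """Return (source, dest, source_count, dest_count) for every pair whose counts differ."""
    mismatches = []
    for source, dest in index_pairs:
        expected = source_counts.get(source)
        actual = dest_counts.get(dest)
        if expected != actual:
            mismatches.append((source, dest, expected, actual))
    return mismatches


# Illustrative numbers, not real results
index_pairs = [("items1", "cj-items1"), ("items2", "cj-items2")]
source_counts = {"items1": 1200, "items2": 530}
dest_counts = {"cj-items1": 1200, "cj-items2": 528}
print(compare_counts(index_pairs, source_counts, dest_counts))
```

Recently indexed documents may not be visible to _count until the destination index has been refreshed, so a small shortfall immediately after the migration is worth rechecking before treating it as data loss.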
Migration speed varies depending on the environment and document size, but generally the following can be expected:
- Small to medium documents: 500-1,000 docs/sec
- Large documents: 100-500 docs/sec
- Network bandwidth: The distance and bandwidth between clusters significantly affect speed

Summary

By combining the Scroll API and Bulk API, even large-scale data can be migrated reliably. It is important to account for environmental differences, such as the availability of custom analyzers, while carrying out the migration.