Overview
I had the opportunity to train a YOLOv11 classification model (kuzushiji recognition) on mdx.jp, so here are my notes.
Dataset
The target is the following “Kuzushiji Dataset”.
http://codh.rois.ac.jp/char-shape/book/
Dataset Creation
The dataset is converted to the YOLO classification format (one directory per class). First, the data, which is organized by book title, is flattened and merged.
[Code listing not recoverable: a Python script that copies the images from the per-book directories into a single directory tree, one directory per character class.]
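As a rough illustration of the flattening step, here is a minimal sketch. It is not the script used in this post. It assumes a source layout of `<book>/characters/<class>/<image>`, and the function name `flatten_by_class` and the book-name prefix on each copied file are hypothetical choices made for this example.

```python
import shutil
from pathlib import Path


def flatten_by_class(src_root, dst_root):
    """Copy <book>/characters/<class>/<image> to <dst_root>/<class>/<book>_<image>.

    The directory layout is an assumption for this sketch; adjust the glob
    pattern if the downloaded data is organized differently.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = 0
    for image in sorted(src_root.glob("*/characters/*/*")):
        if not image.is_file():
            continue
        book = image.parents[2].name   # top-level book directory
        label = image.parent.name      # character class directory
        target_dir = dst_root / label
        target_dir.mkdir(parents=True, exist_ok=True)
        # Prefix with the book name so files from different books cannot collide.
        shutil.copy2(image, target_dir / f"{book}_{image.name}")
        copied += 1
    return copied
```

Calling `flatten_by_class("data/books", "data/flat")` (both paths are placeholders) returns the number of files copied, which can be compared against the expected image count.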
Next, the dataset is split using the following script.
[Code listing not recoverable: a Python function that, for each class directory, lists its files, splits them into subsets by ratio (the values 0.7 and 0.15 appear in the listing), and copies them into the output directory.]
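The splitting step could look something like the sketch below. It is written for illustration and is not the original script. The 70/15/15 train/val/test split, the fixed `seed`, and the subset directory names are assumptions; only the ratio values 0.7 and 0.15 are hinted at by the original listing.

```python
import random
import shutil
from pathlib import Path


def split_dataset(input_dir, output_dir, train_ratio=0.7, val_ratio=0.15, seed=0):
    """Split each class directory into train/val/test subsets and copy the files."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {"train": 0, "val": 0, "test": 0}
    for class_dir in sorted(p for p in input_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_train = int(len(files) * train_ratio)
        n_val = int(len(files) * val_ratio)
        subsets = {
            "train": files[:n_train],
            "val": files[n_train:n_train + n_val],
            "test": files[n_train + n_val:],  # whatever remains
        }
        for name, subset in subsets.items():
            if not subset:
                continue
            target = output_dir / name / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for f in subset:
                shutil.copy2(f, target / f.name)
            counts[name] += len(subset)
    return counts
```

Note that with truncating arithmetic like this, classes with only a few images may end up with no validation or test samples, which is worth checking for rare characters.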
As a result, a dataset of 1,086,326 images was created.
Using Ultralytics HUB (Failed)
First, I considered using Ultralytics HUB and attempted to upload the dataset, but the upload failed with an error. I could not determine the cause, for example whether the dataset had been created incorrectly.
Uploading to mdx.jp Object Storage
The dataset is uploaded to object storage with the following command.
s3cmd sync data/split_dataset_full.zip s3://satoru196/dataset/
Reference: Initial Setup
The official manual is available at the following link.
https://docs.mdx.jp/ja/index.html#オブジェクトストレージの利用方法例
sudo apt install s3cmd
s3cmd --configure
Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: ★(Enter access key)
Secret Key: ★(Enter secret key)
Default Region [US]:
Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: ★s3ds.mdx.jp
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: ★s3ds.mdx.jp
Encryption password:
Path to GPG program [/usr/bin/gpg]:
Use HTTPS protocol [Yes]:
HTTP Proxy server name:
Test access with supplied credentials? [Y/n]
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)
Now verifying that encryption works...
Not configured. Never mind.
Save settings? [y/N] ★y
Upload
s3cmd put (filename) s3://(bucket-name)
Download
s3cmd get s3://(bucket-name)/(object-key-name)
Operations on mdx.jp
A 1-GPU pack on mdx.jp is used.
The dataset is downloaded from object storage using the following command.
s3cmd get s3://satoru196/dataset/split_dataset_full.zip
unzip split_dataset_full.zip
Then, training is executed using the following script.
[Code listing not recoverable: a Python script that loads a YOLO classification model with Ultralytics and calls its training method with the dataset path, number of epochs, image size, and batch size.]
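For orientation, a training script along these lines might look like the sketch below. It is an illustration under stated assumptions, not the script used here. The checkpoint name `yolo11x-cls.pt`, `epochs=10`, `imgsz=224`, and the helper names `build_train_args` and `run_training` are all assumed for this example; only `batch=256` comes from the text of this post.

```python
def build_train_args(data_dir, epochs=10, imgsz=224, batch=256):
    """Collect keyword arguments for Ultralytics' model.train().

    epochs and imgsz are placeholder values chosen for this sketch.
    """
    return {"data": str(data_dir), "epochs": epochs, "imgsz": imgsz, "batch": batch}


def run_training(data_dir, weights="yolo11x-cls.pt", **overrides):
    """Train a classification model on a directory with train/ and val/ subfolders."""
    # Imported lazily so the helper above can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # pretrained classification checkpoint (assumed name)
    return model.train(**build_train_args(data_dir, **overrides))
```

Running `run_training("split_dataset_full")` from a script would start training, where the directory name is taken from the unzipped archive and is assumed to contain the split subfolders.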
To continue training even if the SSH connection is lost, tmux is used.
tmux new -s my_training_session
python your_training_script.py
After execution, a series of checks is run and the results are printed to the console.
You can check the training progress by opening the URL printed after “Ultralytics HUB: View model at” in the output.
Batch Size
Initially, the batch size was left unspecified. The default appears to be 16, and with that setting GPU memory usage was low.
To take advantage of the A100’s 40 GB of memory, the batch size was changed to 256. As a result, the execution time per epoch was reduced.
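One way to see why a larger batch helps is to count the iterations needed per epoch. The sketch below does that arithmetic. The 70% training share is an assumption carried over from the split ratio, and fewer iterations does not by itself guarantee a proportional speedup, since each iteration processes more images.

```python
import math

TOTAL_IMAGES = 1_086_326  # dataset size reported above
TRAIN_RATIO = 0.7         # assumed share of images in the training subset


def steps_per_epoch(num_images, batch_size):
    """Number of iterations needed to see every image once."""
    return math.ceil(num_images / batch_size)


train_images = int(TOTAL_IMAGES * TRAIN_RATIO)
for batch_size in (16, 256):
    print(f"batch={batch_size}: {steps_per_epoch(train_images, batch_size)} iterations per epoch")
```

Going from a batch size of 16 to 256 cuts the iteration count by roughly a factor of 16, which reduces per-iteration overhead as long as the GPU has the memory to hold the larger batch.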
Reference: tmux
Display the list of current tmux sessions
tmux ls
Reattach to a specific session
tmux attach -t my_training_session
Detach from a tmux session
Press Ctrl + b, then press d.
Training Results
The trained model is available from the following Hugging Face Space.
https://huggingface.co/spaces/nakamura196/yolov11x-cls-codh-char
There is room for improvement in accuracy, but in some cases the model returned correct results.
The character images used for these checks are from the following source.
https://mojiportal.nabunken.go.jp/
Summary
I hope this serves as a useful reference for training with YOLO, mdx.jp, and related tools.