Posts on Jam Sylph's little universe https://jamsylph.top/en-us/posts/ Recent content in Posts on Jam Sylph's little universe Hugo -- gohugo.io en-us Sun, 01 Mar 2020 00:00:00 +0000 Evolution of CNN https://jamsylph.top/en-us/posts/evolution-of-cnn/ Sun, 01 Mar 2020 00:00:00 +0000 https://jamsylph.top/en-us/posts/evolution-of-cnn/ <blockquote> <h2 id="-update-log">📝 Update Log</h2> <ul> <li>2024-05-26: Added the latest progress in vision models for 2023-2024, including Phase 6 architecture analysis</li> <li>2023-11-15: Added analysis of current CNN patterns</li> <li>2023-09-15: Expanded Phase 5 (2020-present) architecture introduction, added ConvNeXt analysis</li> <li>2020-03-01: Initial article publication</li> </ul></blockquote> <h1 id="evolution-of-cnn">Evolution of CNN</h1> <h2 id="phase-1-foundation-era-1998-2011">Phase 1: Foundation Era (1998-2011)</h2> <h3 id="lenet-5-1998">LeNet-5 (1998)</h3> <p><strong>Proposed by</strong>: Yann LeCun et al.</p> <p><strong>Main Architecture</strong>:</p> <ul> <li>7-layer structure: 3 convolutional layers, 2 pooling layers, 2 fully connected layers</li> <li>Uses 5×5 convolution kernels</li> <li>Uses sigmoid/tanh activation functions</li> </ul> <p><strong>Breakthroughs</strong>:</p> Complete Analysis of YOLOv8 Decoding Process https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/ Mon, 01 Jan 0001 00:00:00 +0000 https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/ <h1 id="complete-analysis-of-yolov8-decoding-process">Complete Analysis of YOLOv8 Decoding Process</h1> <blockquote> <p>This article provides a detailed analysis of the prediction decoding and post-processing mechanisms in the YOLOv8 object detection algorithm, including key components such as DFL (Distribution Focal Loss) decoding and Non-Maximum Suppression (NMS).</p></blockquote> <h2 id="table-of-contents">Table of Contents</h2> <ul> <li><a 
href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#1-prediction-decoding-process-decode_predictions">1. Prediction Decoding Process (decode_predictions)</a> <ul> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#11-grid-point-generation">1.1 Grid Point Generation</a></li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#12-feature-map-processing">1.2 Feature Map Processing</a></li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#13-dfl-decoding-implementation">1.3 DFL Decoding Implementation</a></li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#14-coordinate-transformation">1.4 Coordinate Transformation</a></li> </ul> </li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#2-post-processing-workflow-post_process">2. Post-processing Workflow (post_process)</a> <ul> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#21-confidence-filtering">2.1 Confidence Filtering</a></li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#22-coordinate-format-conversion">2.2 Coordinate Format Conversion</a></li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#23-non-maximum-suppression">2.3 Non-Maximum Suppression</a></li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#24-coordinate-scale-restoration">2.4 Coordinate Scale Restoration</a></li> </ul> </li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#3-core-method-analysis">3. 
Core Method Analysis</a> <ul> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#31-dist2bbox-method">3.1 dist2bbox Method</a></li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#32-scale_boxes-method">3.2 scale_boxes Method</a></li> </ul> </li> <li><a href="https://jamsylph.top/en-us/posts/complete-analysis-of-yolov8-decoding-process/#4-complete-code-reference">4. Complete Code Reference</a></li> </ul> <h2 id="1-prediction-decoding-process-decode_predictions">1. Prediction Decoding Process (decode_predictions)</h2> <p>YOLOv8 adopts an anchor-free design, so prediction decoding converts the raw network outputs into standard bounding box format. The process can be divided into several key steps:</p> Deep Dive into YOLOv5 dataloaders.py https://jamsylph.top/en-us/posts/deep-dive-into-yolov5-dataloaders.py/ Mon, 01 Jan 0001 00:00:00 +0000 https://jamsylph.top/en-us/posts/deep-dive-into-yolov5-dataloaders.py/ <blockquote> <p>In YOLOv5 object detection tasks, when the <a href="yolo5-code-in-depth.md">train.py code</a> runs, it builds <code>train_loader</code> by calling the <code>create_dataloader</code> function, which internally creates an instance of the <code>LoadImagesAndLabels</code> class as the dataset. The <code>create_dataloader</code> function returns both the DataLoader and the dataset:</p></blockquote> <h1 id="-dataloaderspy">-dataloaders.py</h1> <h1 id="working-principle-of-the-__getitem__-method-in-pytorch-datasets">Working Principle of the <code>__getitem__</code> Method in PyTorch Datasets</h1> <p><code>__getitem__</code> is a special (magic) method in Python; YOLOv5&rsquo;s <code>LoadImagesAndLabels</code> class uses it to access individual samples from the dataset. 
This method is called when you use a data loader or directly access the dataset through indexing.</p> Deep Dive into YOLOv5 Object Detection Code https://jamsylph.top/en-us/posts/deep-dive-into-yolov5-code/ Mon, 01 Jan 0001 00:00:00 +0000 https://jamsylph.top/en-us/posts/deep-dive-into-yolov5-code/ <h1 id="deep-dive-into-yolov5-object-detection-code">Deep Dive into YOLOv5 Object Detection Code</h1> <blockquote> <p>This article offers an in-depth analysis of YOLOv5&rsquo;s training process and data augmentation mechanisms, helping to organize and summarize the internal implementation details of the YOLOv5 object detection model.</p></blockquote> <hr> <h2 id="1-analysis-of-trainpy-file">1. Analysis of train.py File</h2> <h3 id="11-import-section">1.1 Import Section</h3> <div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">argparse</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">math</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">os</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">random</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">subprocess</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">sys</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">time</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">copy</span> <span class="kn">import</span> <span class="n">deepcopy</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span><span 
class="p">,</span> <span class="n">timedelta</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">pathlib</span> <span class="kn">import</span> <span class="n">Path</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="k">try</span><span class="p">:</span> </span></span><span class="line"><span class="cl"> <span class="kn">import</span> <span class="nn">comet_ml</span> <span class="c1"># must be imported before torch (if installed)</span> </span></span><span class="line"><span class="cl"><span class="k">except</span> <span class="ne">ImportError</span><span class="p">:</span> </span></span><span class="line"><span class="cl"> <span class="n">comet_ml</span> <span class="o">=</span> <span class="kc">None</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="nn">np</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">torch</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">torch.distributed</span> <span class="k">as</span> <span class="nn">dist</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">torch.nn</span> <span class="k">as</span> <span class="nn">nn</span> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">yaml</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">torch.optim</span> <span class="kn">import</span> <span class="n">lr_scheduler</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">tqdm</span> <span class="kn">import</span> <span class="n">tqdm</span> </span></span><span 
class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="n">FILE</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="vm">__file__</span><span class="p">)</span><span class="o">.</span><span class="n">resolve</span><span class="p">()</span> </span></span><span class="line"><span class="cl"><span class="n">ROOT</span> <span class="o">=</span> <span class="n">FILE</span><span class="o">.</span><span class="n">parents</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="c1"># YOLOv5 root directory</span> </span></span><span class="line"><span class="cl"><span class="k">if</span> <span class="nb">str</span><span class="p">(</span><span class="n">ROOT</span><span class="p">)</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">sys</span><span class="o">.</span><span class="n">path</span><span class="p">:</span> </span></span><span class="line"><span class="cl"> <span class="n">sys</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">ROOT</span><span class="p">))</span> <span class="c1"># add ROOT to PATH</span> </span></span><span class="line"><span class="cl"><span class="n">ROOT</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">relpath</span><span class="p">(</span><span class="n">ROOT</span><span class="p">,</span> <span class="n">Path</span><span class="o">.</span><span class="n">cwd</span><span class="p">()))</span> <span class="c1"># relative</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">val</span> <span 
class="k">as</span> <span class="nn">validate</span> <span class="c1"># for end-of-epoch mAP</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">models.experimental</span> <span class="kn">import</span> <span class="n">attempt_load</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">models.yolo</span> <span class="kn">import</span> <span class="n">Model</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.autoanchor</span> <span class="kn">import</span> <span class="n">check_anchors</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.autobatch</span> <span class="kn">import</span> <span class="n">check_train_batch_size</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.callbacks</span> <span class="kn">import</span> <span class="n">Callbacks</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.dataloaders</span> <span class="kn">import</span> <span class="n">create_dataloader</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.downloads</span> <span class="kn">import</span> <span class="n">attempt_download</span><span class="p">,</span> <span class="n">is_url</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.general</span> <span class="kn">import</span> <span class="p">(</span> </span></span><span class="line"><span class="cl"> <span class="n">LOGGER</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">TQDM_BAR_FORMAT</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">check_amp</span><span class="p">,</span> </span></span><span 
class="line"><span class="cl"> <span class="n">check_dataset</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">check_file</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">check_git_info</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">check_git_status</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">check_img_size</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">check_requirements</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">check_suffix</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">check_yaml</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">colorstr</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">get_latest_run</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">increment_path</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">init_seeds</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">intersect_dicts</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">labels_to_class_weights</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">labels_to_image_weights</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">methods</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">one_cycle</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">print_args</span><span 
class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">print_mutation</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">strip_optimizer</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">yaml_save</span><span class="p">,</span> </span></span><span class="line"><span class="cl"><span class="p">)</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.loggers</span> <span class="kn">import</span> <span class="n">LOGGERS</span><span class="p">,</span> <span class="n">Loggers</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.loggers.comet.comet_utils</span> <span class="kn">import</span> <span class="n">check_comet_resume</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.loss</span> <span class="kn">import</span> <span class="n">ComputeLoss</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.metrics</span> <span class="kn">import</span> <span class="n">fitness</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.plots</span> <span class="kn">import</span> <span class="n">plot_evolve</span> </span></span><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">utils.torch_utils</span> <span class="kn">import</span> <span class="p">(</span> </span></span><span class="line"><span class="cl"> <span class="n">EarlyStopping</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">ModelEMA</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">de_parallel</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span 
class="n">select_device</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">smart_DDP</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">smart_optimizer</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">smart_resume</span><span class="p">,</span> </span></span><span class="line"><span class="cl"> <span class="n">torch_distributed_zero_first</span><span class="p">,</span> </span></span><span class="line"><span class="cl"><span class="p">)</span> </span></span><span class="line"><span class="cl"> </span></span><span class="line"><span class="cl"><span class="n">LOCAL_RANK</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">getenv</span><span class="p">(</span><span class="s2">&#34;LOCAL_RANK&#34;</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">))</span> </span></span><span class="line"><span class="cl"><span class="n">RANK</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">getenv</span><span class="p">(</span><span class="s2">&#34;RANK&#34;</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">))</span> </span></span><span class="line"><span class="cl"><span class="n">WORLD_SIZE</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">getenv</span><span class="p">(</span><span class="s2">&#34;WORLD_SIZE&#34;</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> </span></span><span class="line"><span class="cl"><span class="n">GIT_INFO</span> <span class="o">=</span> <span class="n">check_git_info</span><span class="p">()</span> 
</span></span></code></pre></div><h3 id="12-detailed-explanation-of-the-train-function">1.2 Detailed Explanation of the train() Function</h3> <p>The <code>train()</code> function is the core of YOLOv5 training and manages the entire training process:</p> Exploration of the Inception Architecture https://jamsylph.top/en-us/posts/exploration-of-the-inception-architecture/ Mon, 01 Jan 0001 00:00:00 +0000 https://jamsylph.top/en-us/posts/exploration-of-the-inception-architecture/ <blockquote> <h2 id="-update-log">📝 Update Log</h2> <ul> <li> <p>2024-05-18:</p> <ul> <li>Enhanced the &ldquo;Adaptive Feature Processing&rdquo; section, refining the three levels of adaptive mechanisms</li> <li>Added a new &ldquo;Bottleneck Layer Design Pattern&rdquo; section, with in-depth analysis of the dimension reduction-processing-expansion design philosophy</li> </ul> </li> <li> <p>2023-10-20:</p> <ul> <li>Comprehensive revision of article structure, strengthening the universality of Inception principles</li> <li>Added discussion on Inception&rsquo;s long-term impact on modern network design</li> </ul> </li> <li> <p>2023-05-12:</p> <ul> <li>Expanded on Inception&rsquo;s influence in other networks</li> <li>Added application of Inception principles in segmentation networks</li> </ul> </li> <li> <p>2022-08-15:</p> Introduction to YOLO https://jamsylph.top/en-us/posts/introduction-to-yolo/ Mon, 01 Jan 0001 00:00:00 +0000 https://jamsylph.top/en-us/posts/introduction-to-yolo/ <h1 id="introduction-to-yolo">Introduction to YOLO</h1> <p>YOLO (You Only Look Once) is a popular real-time object detection algorithm known for its speed and high accuracy. 
Unlike traditional object detection methods, YOLO treats object detection as a regression problem, directly predicting bounding boxes and class probabilities from the complete image.</p> <h2 id="basic-principles-of-yolo">Basic Principles of YOLO</h2> <p>The core idea of YOLO is to divide the entire image into an S×S grid, with each grid cell responsible for predicting objects contained within it. Specifically, each grid cell predicts:</p> Understanding Mamba through MambaStock https://jamsylph.top/en-us/posts/mamba-stock-implementation-and-insights/ Mon, 01 Jan 0001 00:00:00 +0000 https://jamsylph.top/en-us/posts/mamba-stock-implementation-and-insights/ <blockquote> <h2 id="-update-log">📝 Update Log</h2> <ul> <li>2024-06-16: <ul> <li>Summarized experiment results and future improvement directions</li> <li>Added comprehensive comparison analysis with other models</li> <li>Enhanced technical insights and final conclusions</li> </ul> </li> <li>2024-06-15: <ul> <li>Created MambaStock paper implementation documentation</li> <li>Detailed analysis of Mamba model architecture</li> <li>Implemented MambaStock design improvements</li> </ul> </li> </ul></blockquote> <h1 id="understanding-mamba-through-mambastock">Understanding Mamba through MambaStock</h1> <h2 id="1-review-of-mamba-model-architecture">1. Review of Mamba Model Architecture</h2> <p>According to the paper, the MambaStock structure is an improvement based on the Mamba model, so let&rsquo;s first examine the main architecture of the Mamba model (mamba.py file).</p>
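<p>As a rough orientation before reading the code, the recurrence at the heart of a Mamba block can be sketched in a few lines. This is a minimal single-channel illustration, not the code in mamba.py: the function name <code>selective_scan</code>, the argument layout, and the simplified Euler-style discretization of <code>B</code> are all assumptions made for this example.</p>

```python
import math


def selective_scan(x, delta, A, B, C):
    """Sequential selective state-space scan for one channel (illustrative only).

    x:     list of L input values
    delta: list of L positive, input-dependent step sizes
    A:     list of N diagonal state-matrix entries (negative values decay the state)
    B, C:  lists of L rows with N entries each (input-dependent projections)
    """
    n_state = len(A)
    h = [0.0] * n_state          # hidden state, starts at zero
    y = []
    for t in range(len(x)):
        for n in range(n_state):
            a_bar = math.exp(delta[t] * A[n])   # discretized state transition
            b_bar = delta[t] * B[t][n]          # simplified input discretization (assumption)
            h[n] = a_bar * h[n] + b_bar * x[t]  # h_t = a_bar * h_{t-1} + b_bar * x_t
        # y_t = C_t . h_t
        y.append(sum(C[t][n] * h[n] for n in range(n_state)))
    return y


# With A = 0 the state never decays, so the scan reduces to a running sum.
print(selective_scan([1.0, 2.0, 3.0], [1.0] * 3, [0.0], [[1.0]] * 3, [[1.0]] * 3))
```

<p>Because <code>delta</code>, <code>B</code>, and <code>C</code> vary per time step, the model can decide at each step how much of the past state to keep, which is the "selective" part. Practical implementations typically replace this Python loop with a parallel scan.</p>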