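NVDLA Synthesis

Implemented with the TSMC28 process library. All material comes from public sources and is for study and discussion only; please contact me to have anything infringing removed.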

RTL Code Generation

After cloning the NVDLA repository, check out the nv_small branch and run make:

git checkout nv_small
make

Answer the configuration prompts. The generated tree.make looks like this:

##======================= 										  
## Project Name Setup, multiple projects supported			  	  
##======================= 										  
PROJECTS := nv_small
  																  
##======================= 										  
##Linux Environment Setup 										  
##======================= 										  
  																  
USE_DESIGNWARE  := 1
DESIGNWARE_DIR  := /iccad/synopsys/syn/U-2022.12-SP1/dw/sim_ver
CPP  := /usr/bin/cpp
GCC  := /usr/bin/gcc
CXX  := /usr/bin/g++
PERL := /usr/bin/perl
JAVA := /usr/bin/java
SYSTEMC := /usr/local/systemc-2.3.4
PYTHON := /usr/bin/python3.6
VCS_HOME := /iccad/synopsys/vcs/T-2022.06
NOVAS_HOME := /iccad/synopsys/verdi/T-2022.06
VERDI_HOME := /iccad/synopsys/verdi/T-2022.06
VERILATOR := verilator
CLANG := /home/utils/llvm-4.0.1/bin/clang

Run tmake to generate the RTL:

./tools/bin/tmake -build vmod

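tmake may report that some Perl modules are missing; install them with CPAN as prompted.

SRAM Replacement

The required SRAMs can be found under hw/outdir/nv_small/vmod/rams/synth/. All of them use a single clock port, i.e. they are all pseudo dual port SRAMs.

NVDLA wraps each SRAM in two layers. Take nv_ram_rwsp_20x289.v as an example: it is the outermost wrapper and contains some MBIST-related logic. It instantiates nv_ram_rwsp_20x289_logic.v, a layer that mainly holds the logic for stitching physical SRAMs together. For example, if the physical SRAM only supports a width of 288, the nv_ram_rwsp_20x289_logic layer instantiates 20 regs to stand in for the missing 1 bit.

The physical SRAMs that are actually instantiated can be found under hw/outdir/nv_small/vmod/rams/model/. Possibly because the SRAM IP restricts the available sizes, some dual port SRAMs are instantiated there even though every NVDLA RAM wrapper uses only one clk port (pseudo dual port SRAM).

The later flow does not need the MBIST logic, and the size configurations differ from those of the process library used officially, so the outermost NVDLA SRAM wrapper is replaced directly (nv_ram_rwsp_20x289 in the example above). Pseudo dual port SRAM is preferred for the replacement; dual port SRAM is used only when the size requirement cannot otherwise be met.

The TSMC28 MC2 libraries contain no pseudo dual port SRAM, so a 2prf is used instead. The available compilers are tsn28hpcp2prf_20120200_130a, tsn28hpcpuhddpsram_20120200_170a, and tsn28hpcpdpsram_20120200_130a: the first is a 2prf and the other two are dpsram. The sizes (depth*width) each of them supports are listed below: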

| SEG Option | Mux (CM) | Word Depth (W) | Word Width (I/O) (N) |
|---|---|---|---|
| F | 2 | 16,24,32, 48,56…96, 112,120…160, 176,184…224, 240,248…288, 304,312…352, 368,376…416, 432,440…480, 496,504…512 | 2,3…144 |
| F | 4 | 32,48,64, 96,112…192, 224,240…320, 352,368…448, 480,496…576, 608,624…704, 736,752…832, 864,880…960, 992,1008…1024 | 2,3…72 |
| F | 8 | 64,96,128, 192,224…384, 448,480…640, 704,736…896, 960,992…1152, 1216,1248…1408, 1472,1504…1664, 1728,1760…1920, 1984,2016…2048 | 2,3…36 |
| S | 2 | 64, 80,88…192, 208,216…320, 336,344…448, 464,472…512 | 2,3…144 |
| S | 4 | 128, 160,176…384, 416,432…640, 672,688…896, 928,944…1024 | 2,3…72 |
| S | 8 | 256, 320,352…768, 832,864…1280, 1344,1376…1792, 1856,1888…2048 | 2,3…36 |

tsn28hpcp2prf_20120200_130a size configuration table
| Segment Option (SEG) | Mux Option (CM) | Word Depth (W) | Word Width (I/O) (N) |
|---|---|---|---|
| M | 4 | 32,48…4096 | 10,11…144 |

tsn28hpcpuhddpsram_20120200_170a size configuration table
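| SEG Option | Mux | Word Depth | Word Width (I/O) |
|---|---|---|---|
| F | 4 | 32,48…1024 | 4,5…72 |
| F | 8 | 64,96…2048 | 4,5…36 |
| F | 16 | 128,192…4096 | 4,5…18 |
| M | 4 | 32,48…2048 | 4,5…72 |
| M | 8 | 64,96…4096 | 4,5…36 |
| M | 16 | 128,192…8192 | 4,5…18 |

tsn28hpcpdpsram_20120200_130a size configuration table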

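Based on the SRAM sizes in hw/outdir/nv_small/vmod/rams/synth/, pick suitable configurations and fill in the config.txt that MC2 requires. Some SRAMs have to be stitched together from several macros, and where no matching depth exists a few redundant words are added; the comments after # explain each case. All SRAMs are generated with tsn28hpcp2prf_20120200_130a, and the config.txt content is as follows: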

128x18m2f
128x128m2f # 128x128 * 2 = 128x256
128x64m2f
16x128m2f  # 16x128 * 2 = 16x256
16x136m2f  # 16x136 * 2 = 16x272
16x64m2f
256x3m2f
256x128m2f # 256x128 * 4 = 256x512
256x64m2f
256x7m2f
32x16m2f
32x128m2f  # 32x128 * 2 = 32x256; 32x128 * 4 = 32x512; 32x128 * 6 = 32x768
32x136m2f  # 32x136 * 2 = 32x272; 32x136 * 4 = 32x544
512x128m2f # 512x128 * 2 = 512x256;
512x64m2f
64x10m2f
64x128m2f  # 64x128 * 8 = 64x1024
64x136m2f  # 64x136 * 8 = 64x1088
64x116m2f
64x18m2f
128x11m2f
128x6m2f
160x16m2f
160x144m2f # 160x144 * 3 + 160x82 = 160x514
160x82m2f
160x65m2f
24x128m2f  # 24x128 * 2 + 24x33 = 24x289 (use as 20x289)
248x144m2f # 248x144 * 3 + 248x82 = 248x514 (use as 245x514)
248x82m2f
256x11m2f
32x32m2f
64x144m2f  # 64x144 * 3 + 64x82 = 64x514 (use as 61x514)
64x82m2f
64x64m2f   # use as 61x64
64x65m2f   # use as 61x65
80x14m2s
80x16m2s
80x128m2s # 80x128 * 2 = 80x256
80x144m2s # 80x144 * 3 + 80x82 = 80x514
80x82m2s
80x65m2s
16x65m2f  # use as 8x65
256x8m2f
24x32m2f  # use as 19x32
24x4m2f   # use as 19x4
24x80m2f  # use as 19x80
64x84m2f  # 64x84 * 2 = 64x168, use as 60x168
64x21m2f  # use as 60x21
80x15m2s
80x72m2s
80x9m2s
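
The depth padding noted in the comments above (for example 20 stored in a 24-word macro, or 245 in a 248-word one) can be reproduced mechanically. The following is a minimal sketch, assuming the mux 2 rows of the tsn28hpcp2prf_20120200_130a size table are read as inclusive ranges with a step of 8; the names VALID_DEPTH_RANGES and round_up_depth are hypothetical and not part of MC2.

```python
# Legal word depths for mux 2, transcribed from the 2prf size table.
# Each entry is an inclusive (first, last) range on a step-8 grid
# (an interpretation of entries such as "48,56...96").
VALID_DEPTH_RANGES = {
    "f": [(16, 32), (48, 96), (112, 160), (176, 224), (240, 288),
          (304, 352), (368, 416), (432, 480), (496, 512)],
    "s": [(64, 64), (80, 192), (208, 320), (336, 448), (464, 512)],
}
STEP = 8


def round_up_depth(depth, seg="f"):
    """Return the smallest legal macro depth that is >= depth."""
    for first, last in VALID_DEPTH_RANGES[seg]:
        if depth <= last:
            if depth <= first:
                return first
            # Round up onto the step-8 grid anchored at `first`.
            steps = -(-(depth - first) // STEP)
            return first + steps * STEP
    raise ValueError(f"depth {depth} exceeds the mux 2 range for segment {seg!r}")


if __name__ == "__main__":
    for logical_depth in (20, 245, 61, 19, 60, 8):
        print(logical_depth, "->", round_up_depth(logical_depth))
```

Wider mux options would need their own range lists taken from the same table.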

Run mc2 to generate the SRAMs with the following script:

#!/bin/bash
./tsn28hpcp2prf_130a.pl -NonBWEB -file config.txt 2>&1 | tee "cfg.log"
for cfg_file in *.cfg; do
    base_name=$(basename "$cfg_file" .cfg)
    mkdir -p "$base_name"
    log_file="$base_name/${base_name}.log"
    mc2-eu -eu -c tsn28hpcp2prf_20120200_130a.mco -cfg "$cfg_file" -ui textual -v -p tsmceva -d "$base_name" 2>&1 | tee "$log_file"
done

In practice some of these SRAMs are never used. The gen_sram_inst.sh script below collects which SRAMs are actually instantiated:

#!/bin/bash

# Usage: ./gen_sram_inst.sh <nvdla_sram_synth_dir> <prj_dir>
# Writes sram_inst.f, which lists every SRAM module that is instantiated

nvdla_sram_synth_dir="$1"
prj_dir="$2"
output_file="sram_inst.f"

# Extract all SRAM module names
modules=$(ls "$nvdla_sram_synth_dir" | grep -E '_[0-9]+[xX][0-9]+\.v$' | sed 's/\.v$//')

# Initialize (truncate) the output file
> "$output_file"

for module in $modules; do
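    # Search .v and .sv files under the project directory (excluding the listed file types)
    # The regular expression matches both instantiation forms (with and without a #(...) parameter list)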
    echo "Searching $module"
    found=$(find "$prj_dir" -type f \( -name "*.v" -o -name "*.sv" \) \
        ! -name "*logic.v" ! -name "*.vcp" ! -name "*.log" ! -name "*.f" \
        -exec grep -E -m1 "\b${module}\b(\s*#\s*\([^)]*\))?\s+\w+\s*\(" {} + 2>/dev/null)

    if [ -n "$found" ]; then
        echo "$found"
        echo "$module" >> "$output_file"
    fi
done

echo "SRAM instance search complete, saved to:$output_file"

Usage example:

# Writes sram_inst.f
./gen_sram_inst.sh nvdla/hw/outdir/nv_small/vmod/rams/synth/ nvdla/hw/outdir/nv_small/ 
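
To see what the instantiation pattern in gen_sram_inst.sh accepts and rejects, it can be tried on a few sample lines. This is only a sketch: it uses Python's re module as a stand-in for grep -E, and the helper names and sample Verilog lines are made up for illustration.

```python
import re


def inst_pattern(module):
    # Same shape as the grep -E expression in gen_sram_inst.sh:
    # module name, optional #( ... ) parameter list, instance name, "(".
    return re.compile(
        r"\b" + re.escape(module) + r"\b(\s*#\s*\([^)]*\))?\s+\w+\s*\("
    )


def is_instantiated(module, text):
    """True if `text` contains an instantiation of `module`."""
    return inst_pattern(module).search(text) is not None


if __name__ == "__main__":
    name = "nv_ram_rws_128x18"
    samples = [
        "nv_ram_rws_128x18 u_ram (",          # plain instantiation
        "nv_ram_rws_128x18 #(8) u_ram (",     # with a parameter list
        "module nv_ram_rws_128x18 (",         # module declaration
        "nv_ram_rws_128x18_logic u_logic (",  # a different, longer module name
    ]
    for line in samples:
        print(is_instantiated(name, line), line)
```

Because the parameter list is matched with [^)]*, a parameter list containing nested parentheses, such as #(.W(8)), is not recognized by this pattern.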

Next, generate the sram_wrapper files, which contain the SRAM IP instantiations and the bit-concatenation logic. The script is as follows:

import os
import re
import math
import argparse

def parse_fcp_line(line):
    line = line.strip()
    if not line:
        return None
    parts = line.split('#', 1)
    wrapper_name = parts[0].strip()
    comment = parts[1].strip() if len(parts) > 1 else None

    # Parse wrapper's depth and width
    size_part = wrapper_name.split('_')[-1]
    depth, width = map(int, size_part.split('x'))

    # Parse comment to get base segments
    segments = []
    if comment:
        segment_pattern = re.compile(r'\s*(\d+)x(\d+)(?:\s*\*\s*(\d+))?')
        for part in comment.split('+'):
            part = part.strip()
            match = segment_pattern.match(part)
            if not match:
                raise ValueError(f"Invalid comment segment: {part}")
            seg_depth = int(match.group(1))
            seg_width = int(match.group(2))
            seg_count = int(match.group(3)) if match.group(3) else 1
            segments.append({
                'depth': seg_depth,
                'width': seg_width,
                'count': seg_count
            })
    else:
        segments.append({
            'depth': depth,
            'width': width,
            'count': 1
        })
    return {
        'wrapper_name': wrapper_name,
        'depth': depth,
        'width': width,
        'segments': segments
    }

def find_sram_module(base_dir, target_depth, target_width):
    target_size = f"{target_depth}x{target_width}"
    candidates = []
    for dir_name in os.listdir(base_dir):
        dir_path = os.path.join(base_dir, dir_name)
        if os.path.isdir(dir_path):
            matches = re.findall(r'(\d+)x(\d+)', dir_name)
            for d, w in matches:
                if f"{d}x{w}" == target_size:
                    candidates.append(dir_name)
                    break
    if not candidates:
        raise ValueError(f"No module found for {target_size} in {base_dir}")
    elif len(candidates) > 1:
        raise ValueError(f"Multiple modules found for {target_size}: {candidates}")
    else:
        return candidates[0]

def generate_wrapper(wrapper_info, sram_dir, output_dir):
    wrapper_name = wrapper_info['wrapper_name']
    depth = wrapper_info['depth']
    width = wrapper_info['width']
    segments = wrapper_info['segments']

    address_width = math.ceil(math.log2(depth)) if depth > 0 else 0

    instances = []
    current_bit = 0
    for seg in segments:
        seg_depth = seg['depth']
        seg_width = seg['width']
        seg_count = seg['count']
        module_name = find_sram_module(sram_dir, seg_depth, seg_width)
        if not module_name:
            raise ValueError(f"No module found for {seg_depth}x{seg_width}")

        for _ in range(seg_count):
            start_bit = current_bit
            end_bit = current_bit + seg_width - 1
            current_bit += seg_width
            instances.append({
                'module_name': module_name,
                'start_bit': start_bit,
                'end_bit': end_bit,
                'width': seg_width,
                'instance_name': f"sram_{len(instances)}",
                'dout_wire': f"dout{len(instances)}"
            })

    total_width = sum(inst['width'] for inst in instances)
    if total_width != width:
        raise ValueError(f"Total width {total_width} != {width} for {wrapper_name}")

    sorted_instances = sorted(instances, key=lambda x: -x['start_bit'])

    verilog_code = []
    verilog_code.append(f"module {wrapper_name} (")
    verilog_code.append(f"   input clk,")
    verilog_code.append(f"   input [{address_width-1}:0] ra,")
    verilog_code.append(f"   input re,")
    verilog_code.append(f"   output [{width-1}:0] dout,")
    verilog_code.append(f"   input [{address_width-1}:0] wa,")
    verilog_code.append(f"   input we,")
    verilog_code.append(f"   input [{width-1}:0] di,")
    verilog_code.append(f"   input pwrbus_ram_pd")
    verilog_code.append(f");\n")

    for inst in instances:
        verilog_code.append(f"wire [{inst['width']-1}:0] {inst['dout_wire']};")

    for inst in instances:
        verilog_code.append(f"{inst['module_name']} {inst['instance_name']} (")
        verilog_code.append(f"   .AA(wa),")
        verilog_code.append(f"   .D(di[{inst['end_bit']}:{inst['start_bit']}]),")
        verilog_code.append(f"   .WEB(~we),")
        verilog_code.append(f"   .CLKW(clk),")
        verilog_code.append(f"   .AB(ra),")
        verilog_code.append(f"   .Q({inst['dout_wire']}),")
        verilog_code.append(f"   .REB(~re),")
        verilog_code.append(f"   .CLKR(clk)")
        verilog_code.append(f");\n")

    dout_parts = [inst['dout_wire'] for inst in sorted_instances]
    verilog_code.append(f"assign dout = {{ {', '.join(dout_parts)} }};")
    verilog_code.append("endmodule")

    output_file = os.path.join(output_dir, f"{wrapper_name}.v")
    with open(output_file, 'w') as f:
        f.write('\n'.join(verilog_code))

def main():
    parser = argparse.ArgumentParser(description='Generate SRAM wrappers.')
    parser.add_argument('--input', required=True, help='Path to sram_inst.fcp file')
    parser.add_argument('--sram_dir', required=True, help='Directory containing SRAM modules')
    parser.add_argument('--output_dir', required=True, help='Output directory for wrappers')
    args = parser.parse_args()

    with open(args.input, 'r') as f:
        lines = f.readlines()

    os.makedirs(args.output_dir, exist_ok=True)

    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            wrapper_info = parse_fcp_line(line)
            if wrapper_info:
                generate_wrapper(wrapper_info, args.sram_dir, args.output_dir)
        except Exception as e:
            print(f"Error processing line '{line}': {e}")

if __name__ == '__main__':
    main()

Usage example:

python3 gen_sram_wrapper.py --input sram_inst.fcp --sram_dir /iccad/lib/sram/nvdla_sram/TSMCHOME/sram/Compiler/tsn28hpcp2prf_20120200_130a --output_dir .

--input: takes a sram_inst.fcp file. Start from the sram_inst.f written by gen_sram_inst.sh and add the concatenation information by hand, for example:

nv_ram_rws_128x18
nv_ram_rws_16x256 # 16x128 * 2
nv_ram_rws_16x272 # 16x136 * 2
nv_ram_rws_16x64
nv_ram_rws_256x3
nv_ram_rws_256x64
nv_ram_rws_256x7
nv_ram_rws_32x16
nv_ram_rws_64x10
nv_ram_rwsp_128x11
nv_ram_rwsp_128x6
nv_ram_rwsp_160x16
nv_ram_rwsp_160x65
nv_ram_rwsp_20x289 # 24x128 * 2 + 24x33
nv_ram_rwsp_245x514 # 248x144 * 3 + 248x82
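
As a worked example of how such a concatenation comment maps onto data bits, the sketch below lists the bit slice each macro would cover, lowest bits first. It mirrors the segment parsing in gen_sram_wrapper.py but is a standalone illustration; the function name bit_slices is hypothetical.

```python
import re

SEGMENT_RE = re.compile(r"\s*(\d+)x(\d+)(?:\s*\*\s*(\d+))?\s*")


def bit_slices(comment):
    """Return a (depth, width, msb, lsb) tuple per macro for a comment
    such as "24x128 * 2 + 24x33", assigning the lowest bits first."""
    slices = []
    lsb = 0
    for part in comment.split("+"):
        match = SEGMENT_RE.fullmatch(part)
        if match is None:
            raise ValueError(f"bad segment: {part!r}")
        depth, width = int(match.group(1)), int(match.group(2))
        count = int(match.group(3) or 1)
        for _ in range(count):
            slices.append((depth, width, lsb + width - 1, lsb))
            lsb += width
    return slices


if __name__ == "__main__":
    for depth, width, msb, lsb in bit_slices("24x128 * 2 + 24x33"):
        print(f"{depth}x{width} -> di[{msb}:{lsb}]")
```

For nv_ram_rwsp_20x289 this gives two 128-bit slices followed by one 33-bit slice, which together cover all 289 data bits.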

--sram_dir: the location of the SRAM IP, with the following directory layout:

❯ tree -d /iccad/lib/sram/nvdla_sram/TSMCHOME/sram/Compiler/tsn28hpcp2prf_20120200_130a
/iccad/lib/sram/nvdla_sram/TSMCHOME/sram/Compiler/tsn28hpcp2prf_20120200_130a
├── ts6n28hpcphvta128x11m2fbso
│   ├── DATASHEET
│   ├── DFT
│   │   ├── ATPG
│   │   └── MBIST
│   ├── GDSII
│   ├── LEF
│   ├── LOG
│   ├── NLDM
│   ├── SPICE
│   └── VERILOG
├── ts6n28hpcphvta128x128m2fbso
│   ├── DATASHEET
│   ├── DFT
│   │   ├── ATPG
│   │   └── MBIST
│   ├── GDSII
│   ├── LEF
│   ├── LOG
│   ├── NLDM
│   ├── SPICE
│   └── VERILOG
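
The macro directory names embed the same size string as the config.txt entries (for example 128x11m2f), and find_sram_module relies on the depth x width part of it. The sketch below pulls out all four fields; the pattern is inferred only from the directory names shown above, and parse_macro_dir is a hypothetical helper rather than part of the original script.

```python
import re

# <depth>x<width>m<mux><segment>, as it appears inside names such as
# ts6n28hpcphvta128x11m2fbso (pattern inferred from the listing above).
MACRO_RE = re.compile(r"(\d+)x(\d+)m(\d+)([fs])")


def parse_macro_dir(dir_name):
    """Extract depth, width, mux and segment option from a macro directory name."""
    match = MACRO_RE.search(dir_name)
    if match is None:
        raise ValueError(f"no size field found in {dir_name!r}")
    depth, width, mux = (int(g) for g in match.group(1, 2, 3))
    return {"depth": depth, "width": width, "mux": mux, "seg": match.group(4)}


if __name__ == "__main__":
    print(parse_macro_dir("ts6n28hpcphvta128x11m2fbso"))
```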
