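Automating Legacy Code Migration with AI: A Practical Guide to LLM-Based Code Transformation

Your CTO just dropped a bomb in a meeting: "We're moving 200,000 lines of AngularJS to React. In six months." Half the team has never touched AngularJS, and the moment someone asks, "Can't we just use AI for this?", every eye in the room turns to you.

This is playing out at thousands of companies right now. AngularJS, jQuery, Java 8, Python 2, COBOL, aging PHP: legacy code everywhere needs modernizing. The traditional approach (a manual rewrite) takes years and drains morale. The new approach (throw it at an LLM and pray) looks fine until you watch the AI introduce three new bugs for every one it fixes.

The truth is somewhere in between. LLMs can speed up code migration dramatically, but only if you build a proper pipeline around them. This article covers what works, what doesn't, and how to avoid the mistakes that turn an AI migration project into a bigger disaster than the legacy code it was meant to replace.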

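Why Code Migration Is a Great Fit for LLMs (and Why It's Harder Than You Think)

Code migration has properties that suit LLMs particularly well:

Repetitive patterns: Most migrations apply the same transformation patterns across hundreds of files. AngularJS controllers follow a template, and so do Java POJOs. LLMs are strong at recognizing and applying patterns.
Clear inputs and outputs: There is a well-defined "before" (the old framework) and "after" (the new framework), so the transformation rules can be written down.
Verifiable: Unlike writing or summarization, a migration can be checked objectively: does it compile, and do the tests pass?

But the "just paste it into ChatGPT" approach misses some fundamental problems:

The Context Window Problem

A real AngularJS controller never lives alone. It injects services, and those services inject other services. Templates reference directives, and the scope inheritance chain spans multiple files. An LLM that sees only the controller file produces React code that is syntactically correct but semantically wrong.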

┌──────────────────────────────────────────────────────────────┐
│                    The Migration Iceberg                     │
│                                                              │
│                     ┌──────────────┐                         │
│                     │  Controller  │  ← all the LLM sees     │
│                     │   (1 file)   │                         │
│            ─────────┴──────────────┴─────────                │
│           /                                  \               │
│          /  ┌───────────┐ ┌───────────┐       \              │
│         /   │ Services  │ │ Templates │        \             │
│        /    │   (12)    │ │ (3 HTML)  │         \            │
│       /     └───────────┘ └───────────┘          \           │
│      /  ┌──────────┐ ┌──────────┐ ┌──────────┐    \          │
│     /   │  Scope   │ │  Route   │ │  Global  │     \         │
│    /    │  chain   │ │  config  │ │  state   │      \        │
│   /     └──────────┘ └──────────┘ └──────────┘       \       │
│  └────────────────────────────────────────────────────┘      │
│                                                              │
│         ← an accurate migration needs all of this            │
└──────────────────────────────────────────────────────────────┘

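The Same-Syntax, Different-Semantics Problem

Code migration is not just syntax translation; the paradigm itself changes. AngularJS uses two-way data binding through $scope, while React uses one-way data flow with hooks. There is no one-to-one mapping. Mechanically replacing $scope.name with a useState hook compiles fine, but behavior shifts subtly in edge cases: race conditions in form updates, deferred watchers, digest cycle timing.

The 80/20 Rule of AI Migration

In practice, LLMs handle roughly 80% of migration work well: the tedious, repetitive transformations. The remaining 20% (complex business logic, framework-specific edge cases, cross-cutting concerns) needs human judgment. Design your pipeline around that reality.

Architecture: An AST-Based Migration Pipeline

The "copy a file, paste it into ChatGPT" approach doesn't scale. Here is the structure that actually works for production migrations: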

┌──────────────────────────────────────────────────────────────┐
│                    AI Migration Pipeline                     │
│                                                              │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐                │
│  │ 1. Parse │───▶│ 2. Chunk │───▶│ 3. Enrich│                │
│  │  (AST)   │    │ (split)  │    │  context │                │
│  └──────────┘    └──────────┘    └──────────┘                │
│       │                               │                      │
│       ▼                               ▼                      │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐                │
│  │ 6. Test  │◀───│ 5. Post- │◀───│ 4. LLM   │                │
│  │ (verify) │    │ process  │    │ convert  │                │
│  └──────────┘    └──────────┘    └──────────┘                │
│       │                                                      │
│       ▼                                                      │
│  ┌──────────┐                                                │
│  │ 7. Human │    Confidence < 95%? → flag for human review   │
│  │  review  │                                                │
│  └──────────┘                                                │
└──────────────────────────────────────────────────────────────┘

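Let's dig into each stage.

Step 1: Parse into an AST

Don't feed raw source code to the LLM. Parse it into an AST (abstract syntax tree) first. That gives you structural information that isn't visible in plain text.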

// Migrating TypeScript/JavaScript with ts-morph
import { CallExpression, Node, Project, SyntaxKind } from 'ts-morph';

interface MigrationUnit {
  filePath: string;
  type: 'component' | 'controller' | 'service' | 'directive' | 'filter' | 'config';
  className: string;
  dependencies: string[];
  templatePath?: string;
  sourceCode: string;
  ast: CallExpression; // the registration call node, kept for structural checks
  complexity: number;
}

function parseAngularModule(filePath: string): MigrationUnit[] {
  // allowJs lets ts-morph treat plain .js AngularJS sources as project files
  const project = new Project({ compilerOptions: { allowJs: true } });
  const sourceFile = project.addSourceFileAtPath(filePath);
  const units: MigrationUnit[] = [];

  sourceFile.getDescendantsOfKind(SyntaxKind.CallExpression).forEach(call => {
    // Match on the method name itself (angular.module('app').controller(...))
    // rather than a substring, so calls like this.controllers.push() are ignored
    const callee = call.getExpression();
    if (!Node.isPropertyAccessExpression(callee)) return;
    const method = callee.getName();
    if (method !== 'component' && method !== 'controller') return;

    const name = call.getArguments()[0]?.getText().replace(/['"]/g, '') ?? '';

    units.push({
      filePath,
      type: method === 'component' ? 'component' : 'controller',
      className: name,
      // extractDependencies is a project helper that this article does not show
      dependencies: extractDependencies(call),
      sourceCode: call.getFullText(),
      ast: call,
      complexity: calculateComplexity(call),
    });
  });

  return units;
}

// Rough cyclomatic complexity: 1 plus the number of branching constructs
function calculateComplexity(node: Node): number {
  const branchKinds = [
    SyntaxKind.IfStatement,
    SyntaxKind.SwitchStatement,
    SyntaxKind.ForStatement,
    SyntaxKind.WhileStatement,
    SyntaxKind.ConditionalExpression,
  ];
  return branchKinds.reduce(
    (total, kind) => total + node.getDescendantsOfKind(kind).length,
    1
  );
}
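
The parser calls an extractDependencies helper that the article never shows. One possible shape is sketched below. It works on raw source text rather than on the ts-morph node, and the names extractInjectedNames and projectDependencies are assumptions made for illustration. A production version would walk the AST and would also need to handle $inject and implicit annotation.

```typescript
// Hypothetical helper: pull DI names out of an inline array annotation such as
// .controller('Ctrl', ['$scope', 'UserService', function($scope, UserService) {...}])
function extractInjectedNames(source: string): string[] {
  // Capture the quoted names between the opening "[" and the "function" keyword
  const match = source.match(/\[\s*((?:['"][^'"]+['"]\s*,\s*)+)function\b/);
  if (!match) return [];

  const quoted: string[] = match[1].match(/['"]([^'"]+)['"]/g) ?? [];
  return quoted.map(q => q.slice(1, -1)); // strip the surrounding quotes
}

// Framework-provided services start with "$"; only project services need
// to be tracked as migration dependencies.
function projectDependencies(source: string): string[] {
  return extractInjectedNames(source).filter(name => !name.startsWith('$'));
}
```

Filtering out the "$"-prefixed names keeps the dependency list limited to code that actually has to be migrated.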

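Why bother starting with an AST?

Prioritization: Migrate low-complexity files first (LLM success rates are higher there).
Intelligent splitting: Split at function and class boundaries, not at arbitrary line counts.
Dependency tracking: Work out which files have to move together.
Output verification: Compare the AST structure of input and output to catch structural regressions.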
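
Dependency tracking is what lets the pipeline migrate leaf units first. A minimal sketch of that ordering follows. It assumes each unit only exposes a name and its dependency names; UnitNode and migrationOrder are hypothetical names, not part of the article's pipeline.

```typescript
// Minimal shape for illustration; the article's MigrationUnit has more fields.
interface UnitNode {
  className: string;
  dependencies: string[];
}

// Depth-first topological sort: a unit is emitted only after everything it
// depends on, so leaf services come first. Units caught in a dependency
// cycle are returned separately because they have to be migrated together.
function migrationOrder(units: UnitNode[]): { order: string[]; cyclic: string[] } {
  const byName = new Map<string, UnitNode>();
  for (const u of units) byName.set(u.className, u);
  const state = new Map<string, 'visiting' | 'done'>();
  const order: string[] = [];
  const cyclic = new Set<string>();

  const visit = (name: string, path: string[]): void => {
    const unit = byName.get(name);
    if (!unit || state.get(name) === 'done') return; // unknown (e.g. $http) or finished
    if (state.get(name) === 'visiting') {
      // Everything on the path from the first occurrence of `name` forms a cycle
      path.slice(path.indexOf(name)).forEach(n => cyclic.add(n));
      return;
    }
    state.set(name, 'visiting');
    unit.dependencies.forEach(dep => visit(dep, [...path, name]));
    state.set(name, 'done');
    order.push(name);
  };

  units.forEach(u => visit(u.className, []));
  return { order: order.filter(n => !cyclic.has(n)), cyclic: Array.from(cyclic) };
}
```

Reporting cycles separately, instead of throwing, keeps the rest of the batch moving while a person decides how to handle the tangled units.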

Step 2: Chunk into Migration Units

Don't migrate a whole file in one shot. Migrate logical units: one component, one service, one utility function at a time. That keeps each LLM call focused and inside the context window.

interface MigrationBatch {
  primary: MigrationUnit;
  dependencies: MigrationUnit[];
  templates: string[];
  totalTokens: number;
}

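// Templates are read from disk, so this needs Node's fs module
import { readFileSync } from 'node:fs';

function createBatches(
  units: MigrationUnit[],
  maxTokens: number = 12000
): MigrationBatch[] {
  // Copy before sorting so the caller's array is not reordered
  const sorted = [...units].sort((a, b) => a.complexity - b.complexity);

  return sorted.map(unit => {
    const templates = unit.templatePath
      ? [readFileSync(unit.templatePath, 'utf-8')]
      : [];
    let totalTokens = estimateTokens(unit.sourceCode + templates.join('\n'));

    // Add dependency signatures only while the batch stays within maxTokens
    const deps: MigrationUnit[] = [];
    for (const depName of unit.dependencies) {
      const dep = units.find(u => u.className === depName);
      if (!dep) continue;
      const cost = estimateTokens(extractTypeSignature(dep));
      if (totalTokens + cost > maxTokens) break;
      deps.push(dep);
      totalTokens += cost;
    }

    return { primary: unit, dependencies: deps, templates, totalTokens };
  });
}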

function extractTypeSignature(unit: MigrationUnit): string {
  return `// Dependency: ${unit.className}\n` +
    `// Type: ${unit.type}\n` +
    `// Public methods: ${extractPublicMethods(unit).join(', ')}`;
}
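
The batching code relies on an estimateTokens helper that is not shown. A rough sketch is below. The four-characters-per-token figure and the 20% margin are assumptions for illustration, not tokenizer output; for real budgets, use the tokenizer that belongs to the model you call. partitionByBudget is likewise a hypothetical helper.

```typescript
// Rough heuristic, not a real tokenizer: about 4 characters per token.
// Code and non-English text often tokenize less efficiently, so a
// safety margin is added on top.
const CHARS_PER_TOKEN = 4;
const SAFETY_MARGIN_PERCENT = 20;

function estimateTokens(text: string): number {
  const base = Math.ceil(text.length / CHARS_PER_TOKEN);
  return base + Math.ceil((base * SAFETY_MARGIN_PERCENT) / 100);
}

// Split units into those that fit the per-call budget and those that must
// be chunked further (for example, function by function).
function partitionByBudget<T extends { sourceCode: string }>(
  units: T[],
  maxTokens: number
): { fits: T[]; oversized: T[] } {
  const fits: T[] = [];
  const oversized: T[] = [];
  for (const unit of units) {
    (estimateTokens(unit.sourceCode) <= maxTokens ? fits : oversized).push(unit);
  }
  return { fits, oversized };
}
```

Overestimating slightly is the safer failure mode here: an oversized prompt fails the call, while a conservative estimate only costs an extra split.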

Step 3: Enrich the Context

This is where most AI migrations fail. The LLM needs more context than just the file being converted:

interface MigrationContext {
  batch: MigrationBatch;

  targetFramework: {
    version: string;
    stateManagement: string; // 'zustand' | 'redux' | 'context'
    styling: string;        // 'tailwind' | 'css-modules' | 'styled-components'
    routing: string;        // 'react-router' | 'next'
  };

  conventions: {
    fileNaming: string;     // 'kebab-case' | 'PascalCase'
    exportStyle: string;    // 'named' | 'default'
    hookPrefix: string;     // 'use'
    testFramework: string;  // 'vitest' | 'jest'
  };

  // Examples of migrations you have already completed (used for few-shot prompting)
  examples: {
    before: string;
    after: string;
    explanation: string;
  }[];

  edgeCases: string[];
}

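The examples field is the key. Hand-migrate three to five representative components first, then include them as few-shot examples in every LLM call. The model picks up your project's own conventions from them, and output quality jumps noticeably.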
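
Once you have more reference migrations than fit in a prompt, something has to choose which ones to send. The sketch below prefers examples of the same unit type as the code being migrated. The unitType field, the selectExamples name, and the default limits are all assumptions for illustration.

```typescript
interface FewShotExample {
  unitType: string; // e.g. 'controller' or 'service' (hypothetical field)
  before: string;
  after: string;
  explanation: string;
}

// Prefer examples of the same unit type as the code being migrated, then
// fill remaining slots with other types. Stops at `limit` examples and
// skips any example that would exceed the character budget.
function selectExamples(
  pool: FewShotExample[],
  targetType: string,
  limit = 3,
  maxChars = 8000
): FewShotExample[] {
  const ranked = [
    ...pool.filter(ex => ex.unitType === targetType),
    ...pool.filter(ex => ex.unitType !== targetType),
  ];
  const chosen: FewShotExample[] = [];
  let used = 0;
  for (const ex of ranked) {
    if (chosen.length >= limit) break;
    const size = ex.before.length + ex.after.length + ex.explanation.length;
    if (used + size > maxChars) continue;
    chosen.push(ex);
    used += size;
  }
  return chosen;
}
```

Matching on unit type is a cheap proxy for relevance; a controller example teaches the model more about migrating a controller than a filter example does.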

Step 4: LLM Conversion

Now comes the actual LLM call. The structure of the prompt drives the result:

function buildMigrationPrompt(context: MigrationContext): string {
  return `You are a senior software engineer performing a code migration.

## Source Framework
- AngularJS 1.8 with JavaScript
- Two-way data binding via $scope
- Dependency injection via string annotations

## Target Framework
- React 19 with TypeScript
- State management: ${context.targetFramework.stateManagement}
- Styling: ${context.targetFramework.styling}
- Routing: ${context.targetFramework.routing}

## Project Conventions
- File naming: ${context.conventions.fileNaming}
- Export style: ${context.conventions.exportStyle}
- Test framework: ${context.conventions.testFramework}

## Migration Rules
1. Convert $scope properties to useState/useReducer hooks
2. Convert $scope.$watch to useEffect
3. Convert $scope.$on/$emit to custom hooks or context
4. Convert services to custom hooks or utility modules
5. Convert ng-repeat to .map() with proper keys
6. Convert ng-if/ng-show to conditional rendering
7. Convert $http calls to fetch/axios with proper error handling
8. Preserve ALL business logic exactly — do not simplify or optimize
9. Add TypeScript types for all props, state, and function signatures
10. Do NOT add comments like "// migrated from Angular" or "// TODO"

## Examples of Completed Migrations
${context.examples.map(ex => `
### Before (AngularJS):
\`\`\`javascript
${ex.before}
\`\`\`

### After (React + TypeScript):
\`\`\`typescript
${ex.after}
\`\`\`

### Key decisions: ${ex.explanation}
`).join('\n')}

## Known Edge Cases
${context.edgeCases.map(ec => `- ${ec}`).join('\n')}

## Dependencies Available
${context.batch.dependencies.map(d =>
  `- ${d.className} (${d.type}): Available as imported module`
).join('\n')}

## Source Code to Migrate
\`\`\`javascript
${context.batch.primary.sourceCode}
\`\`\`

${context.batch.templates.length > 0 ? `
## Associated Template
\`\`\`html
${context.batch.templates[0]}
\`\`\`
` : ''}

Migrate this code to React + TypeScript following all conventions above.
Output ONLY the migrated code, no explanations.`;
}

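Key prompt engineering points:

Rule 8 matters most: never change business logic. LLMs love to "improve" code while migrating it, and that is not what you want. Migration and refactoring belong in separate PRs.
Few-shot examples: Include three to five migrations you have actually completed in this project. They work better than any amount of written instructions.
Dependency context: Pass type signatures instead of full implementations to avoid wasting tokens.
No-comment rule: LLMs litter code with self-referential comments such as "migrated from Angular".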
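
The prompt asks for code only, but models still tend to wrap their answer in markdown fences or add a closing remark. A small normalization step before post-processing can absorb that. This is a sketch; stripCodeFences is a hypothetical helper, and it only handles the first fenced block.

```typescript
// Build the fence marker programmatically so this snippet can itself be
// embedded in markdown without confusing the renderer.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(
  FENCE + '[\\w-]*[ \\t]*\\r?\\n([\\s\\S]*?)\\r?\\n' + FENCE
);

// Returns the contents of the first fenced block, or the trimmed input
// when the output contains no complete fence.
function stripCodeFences(llmOutput: string): string {
  const match = llmOutput.match(FENCED_BLOCK);
  return (match ? match[1] : llmOutput).trim();
}
```

Running this before the TypeScript parse keeps a stray fence from being reported as a syntax error and dragging the confidence score down.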

Step 5: Post-Processing

Don't trust LLM output as-is. Post-process it:

async function postProcess(
  llmOutput: string,
  context: MigrationContext
): Promise<{
  code: string;
  confidence: number;
  issues: string[];
}> {
  const issues: string[] = [];
  let confidence = 100;

  try {
    const project = new Project({ useInMemoryFileSystem: true });
    const sourceFile = project.createSourceFile('output.tsx', llmOutput);

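    // Check for TypeScript errors
    const diagnostics = sourceFile.getPreEmitDiagnostics();
    if (diagnostics.length > 0) {
      confidence -= diagnostics.length * 10;
      issues.push(
        ...diagnostics.map(d => {
          // getMessageText() can return a message chain instead of a string
          const msg = d.getMessageText();
          return `TS Error: ${typeof msg === 'string' ? msg : msg.getMessageText()}`;
        })
      );
    }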

    // Validate imports
    const imports = sourceFile.getImportDeclarations();
    for (const imp of imports) {
      const moduleSpecifier = imp.getModuleSpecifierValue();
      if (!isValidImport(moduleSpecifier, context)) {
        confidence -= 15;
        issues.push(`Unresolved import: ${moduleSpecifier}`);
      }
    }

    // Check for comments the prompt forbids (TODO/FIXME)
    const text = sourceFile.getFullText();
    if (text.includes('// TODO') || text.includes('// FIXME')) {
      confidence -= 5;
      issues.push('LLM added TODO/FIXME comments');
    }

    // Validate the Rules of Hooks
    const hookViolations = checkHookRules(sourceFile);
    if (hookViolations.length > 0) {
      confidence -= hookViolations.length * 20;
      issues.push(...hookViolations);
    }

    // Check that business logic was preserved
    const originalFunctions = extractFunctionNames(
      context.batch.primary.sourceCode
    );
    const migratedFunctions = extractFunctionNames(llmOutput);
    const missing = originalFunctions.filter(
      f => !migratedFunctions.some(m => isSimilarName(f, m))
    );
    if (missing.length > 0) {
      confidence -= missing.length * 15;
      issues.push(`Missing functions: ${missing.join(', ')}`);
    }

    // Stacked deductions can push the score below zero, so clamp it
    return { code: llmOutput, confidence: Math.max(0, confidence), issues };

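  } catch (e) {
    // `e` is `unknown` under strict TypeScript, so narrow before reading .message
    const message = e instanceof Error ? e.message : String(e);
    return {
      code: llmOutput,
      confidence: 0,
      issues: [`Parse error: ${message}`],
    };
  }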
}
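
The business-logic check depends on extractFunctionNames and isSimilarName, which the article leaves undefined. The regex-based versions below are hypothetical stand-ins that show the idea. They will produce occasional false positives (for example, on a parenthesized expression assigned to a const), so a real pipeline would collect names from the AST instead.

```typescript
// Collect function names from AngularJS-style and React-style source text.
function extractFunctionNames(source: string): string[] {
  const patterns = [
    /\$scope\.(\w+)\s*=\s*(?:async\s+)?function\b/g, // $scope.loadUsers = function
    /\bfunction\s+(\w+)\s*\(/g,                      // function loadUsers(
    // const loadUsers = (...) => / useCallback(async (...) => / useMemo((...) =>
    /\bconst\s+(\w+)\s*=\s*(?:use(?:Callback|Memo)\()?\s*(?:async\s*)?\(/g,
  ];
  const names = new Set<string>();
  for (const pattern of patterns) {
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(source)) !== null) names.add(m[1]);
  }
  return Array.from(names);
}

// Treat names as equal if they match after dropping case and underscores,
// so load_users, loadUsers and LoadUsers count as the same function.
function isSimilarName(a: string, b: string): boolean {
  const norm = (s: string) => s.replace(/_/g, '').toLowerCase();
  return norm(a) === norm(b);
}

function findMissingFunctions(original: string, migrated: string): string[] {
  const migratedNames = extractFunctionNames(migrated);
  return extractFunctionNames(original).filter(
    name => !migratedNames.some(m => isSimilarName(name, m))
  );
}
```

A missing name does not prove logic was dropped (the model may have inlined it), but it is a cheap signal for routing a file to human review.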

Step 6: Automated Testing

This is the most important stage. Never commit without automated verification:

async function verifyMigration(
  originalPath: string,
  migratedPath: string,
  testSuite: string
): Promise<MigrationVerification> {
  const results: MigrationVerification = {
    compiles: false,
    testsPass: false,
    renderMatches: false,
    accessibilityPass: false,
    performanceRegression: false,
  };

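  // 1. TypeScript compile check. Note: when file paths are passed on the
  //    command line, tsc ignores tsconfig.json, so either pass the needed
  //    flags explicitly or type-check the whole project with `tsc --noEmit -p .`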
  const compileResult = await exec(`npx tsc --noEmit ${migratedPath}`);
  results.compiles = compileResult.exitCode === 0;

  // 2. Run the existing test suite
  if (testSuite) {
    const testResult = await exec(`npx vitest run ${testSuite}`);
    results.testsPass = testResult.exitCode === 0;
  }

  // 3. Visual regression test (optional but valuable)
  results.renderMatches = await compareScreenshots(
    originalPath,
    migratedPath
  );

  return results;
}

Step 7: Confidence-Based Human Review

Not every file needs the same level of human scrutiny:

function triageMigration(
  result: PostProcessResult,
  verification: MigrationVerification
): 'auto-merge' | 'quick-review' | 'deep-review' | 'manual-rewrite' {
  if (result.confidence >= 95 && verification.testsPass
      && verification.compiles) {
    return 'auto-merge';
  }

  if (result.confidence >= 80 && verification.compiles) {
    return 'quick-review';
  }

  if (result.confidence >= 50) {
    return 'deep-review';
  }

  return 'manual-rewrite';
}

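In practice, for an AngularJS→React migration with well-prepared few-shot examples:

~60% of files: auto-merge or quick review
~25% of files: deep review (mostly complex state management)
~15% of files: manual rewrite (complex $scope inheritance, digest cycle hacks)

Migration Patterns in Practice

Let's look at concrete migration patterns and how LLMs handle them.

Pattern 1: AngularJS → React

This is the most common enterprise migration right now.

// ❌ BEFORE: AngularJS controller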
angular.module('app').controller('UserListCtrl',
  ['$scope', '$http', 'UserService', 'NotificationService',
  function($scope, $http, UserService, NotificationService) {

    $scope.users = [];
    $scope.loading = true;
    $scope.searchTerm = '';
    $scope.selectedRole = 'all';

    $scope.loadUsers = function() {
      $scope.loading = true;
      UserService.getAll({ role: $scope.selectedRole })
        .then(function(users) {
          $scope.users = users;
          $scope.loading = false;
        })
        .catch(function(err) {
          NotificationService.error('Failed to load users');
          $scope.loading = false;
        });
    };

    $scope.filteredUsers = function() {
      if (!$scope.searchTerm) return $scope.users;
      return $scope.users.filter(function(user) {
        return user.name.toLowerCase()
          .includes($scope.searchTerm.toLowerCase());
      });
    };

    $scope.$watch('selectedRole', function(newVal, oldVal) {
      if (newVal !== oldVal) $scope.loadUsers();
    });

    $scope.loadUsers();
  }
]);

// ✅ AFTER: React + TypeScript
import { useState, useEffect, useMemo, useCallback } from 'react';
import { useUserService } from '@/hooks/useUserService';
import { useNotification } from '@/hooks/useNotification';
// Presentational components are assumed to live at this (hypothetical) path
import { LoadingSpinner, SearchInput, RoleFilter, UserTable } from '@/components';

interface User {
  id: string;
  name: string;
  email: string;
  role: string;
}

// Named Role so the type does not clash with the <RoleFilter> component
type Role = 'all' | 'admin' | 'user' | 'moderator';

export function UserList() {
  const [users, setUsers] = useState<User[]>([]);
  const [loading, setLoading] = useState(true);
  const [searchTerm, setSearchTerm] = useState('');
  const [selectedRole, setSelectedRole] = useState<Role>('all');

  const userService = useUserService();
  const { showError } = useNotification();

  const loadUsers = useCallback(async () => {
    setLoading(true);
    try {
      const data = await userService.getAll({ role: selectedRole });
      setUsers(data);
    } catch {
      showError('Failed to load users');
    } finally {
      setLoading(false);
    }
  }, [selectedRole, userService, showError]);

  useEffect(() => {
    loadUsers();
  }, [loadUsers]);

  const filteredUsers = useMemo(() => {
    if (!searchTerm) return users;
    return users.filter(user =>
      user.name.toLowerCase().includes(searchTerm.toLowerCase())
    );
  }, [users, searchTerm]);

  if (loading) return <LoadingSpinner />;

  return (
    <div>
      <SearchInput value={searchTerm} onChange={setSearchTerm} />
      <RoleFilter value={selectedRole} onChange={setSelectedRole} />
      <UserTable users={filteredUsers} />
    </div>
  );
}

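What LLMs get right: state mapping, basic hook conversion, effect dependencies.
What LLMs get wrong: useCallback dependency arrays (deps are often missing), useMemo optimization boundaries, and error handling patterns that fit the project's own notification system.

Pattern 2: Java 8 → Kotlin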

// ❌ BEFORE: Java 8
public class OrderProcessor {
    private final OrderRepository orderRepo;
    private final PaymentService paymentService;
    private final NotificationService notificationService;

    public OrderProcessor(OrderRepository orderRepo,
                          PaymentService paymentService,
                          NotificationService notificationService) {
        this.orderRepo = orderRepo;
        this.paymentService = paymentService;
        this.notificationService = notificationService;
    }

    public OrderResult processOrder(OrderRequest request) {
        if (request == null || request.getItems() == null
            || request.getItems().isEmpty()) {
            throw new IllegalArgumentException("Invalid order");
        }

        BigDecimal total = request.getItems().stream()
            .map(item -> item.getPrice()
                .multiply(BigDecimal.valueOf(item.getQuantity())))
            .reduce(BigDecimal.ZERO, BigDecimal::add);

        if (total.compareTo(BigDecimal.valueOf(10000)) > 0) {
            request.setDiscount(total.multiply(
                BigDecimal.valueOf(0.1)));
        }

        PaymentResult payment = paymentService.charge(
            request.getCustomerId(), total);

        if (!payment.isSuccessful()) {
            return OrderResult.failed(payment.getErrorMessage());
        }

        Order order = orderRepo.save(
            Order.from(request, payment.getTransactionId()));
        notificationService.sendConfirmation(
            request.getCustomerId(), order);

        return OrderResult.success(order);
    }
}

// ✅ AFTER: Kotlin
class OrderProcessor(
    private val orderRepo: OrderRepository,
    private val paymentService: PaymentService,
    private val notificationService: NotificationService,
) {
    fun processOrder(request: OrderRequest): OrderResult {
        require(request.items.isNotEmpty()) { "Invalid order" }

        val total = request.items.sumOf { item ->
            item.price * item.quantity.toBigDecimal()
        }

        if (total > 10_000.toBigDecimal()) {
            request.discount = total * 0.1.toBigDecimal()
        }

        val payment = paymentService.charge(request.customerId, total)

        if (!payment.isSuccessful) {
            return OrderResult.failed(payment.errorMessage)
        }

        val order = orderRepo.save(
            Order.from(request, payment.transactionId)
        )
        notificationService.sendConfirmation(request.customerId, order)

        return OrderResult.success(order)
    }
}

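What LLMs get right: null safety, require conversion, property access syntax, trailing commas, simplified expressions.
What LLMs get wrong: operator overloading on BigDecimal, Kotlin-specific collection extensions (sumOf), and idiomatic error handling with Result or sealed classes.

Pattern 3: Python 2 → Python 3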

# ❌ BEFORE: Python 2
class DataProcessor(object):
    def __init__(self, config):
        self.config = config
        self.logger = logging.getLogger(__name__)

    def process_batch(self, items):
        results = []
        for item in items:
            try:
                processed = self._transform(item)
                results.append(processed)
            except Exception, e:
                self.logger.error(
                    u"Failed to process item %s: %s"
                    % (item.get('id', 'unknown'), unicode(e))
                )
        return results

    def _transform(self, item):
        if isinstance(item, basestring):
            item = {'value': item}

        keys = item.keys()
        keys.sort()
        output = {}
        for key in keys:
            value = item[key]
            if isinstance(value, unicode):
                output[key] = value.encode('utf-8')
            elif isinstance(value, (int, long)):
                output[key] = float(value)
            else:
                output[key] = value

        return output

# ✅ AFTER: Python 3
class DataProcessor:
    def __init__(self, config):
        self.config = config
        self.logger = logging.getLogger(__name__)

    def process_batch(self, items):
        results = []
        for item in items:
            try:
                processed = self._transform(item)
                results.append(processed)
            except Exception as e:
                self.logger.error(
                    f"Failed to process item {item.get('id', 'unknown')}: {e}"
                )
        return results

    def _transform(self, item):
        if isinstance(item, str):
            item = {'value': item}

        output = {}
        for key in sorted(item.keys()):
            value = item[key]
            if isinstance(value, str):
                output[key] = value
            elif isinstance(value, int):
                output[key] = float(value)
            else:
                output[key] = value

        return output

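What LLMs get right: the except Exception as e syntax, f-strings, removing unicode/basestring/long, dropping the (object) base class.
What LLMs get wrong: subtle behavior changes. In Python 2, dict.keys() returns a list (a mutable snapshot); in Python 3 it returns a view. The LLM correctly wrapped the call in sorted() here, but it tends to miss code that modifies the dict while iterating over its keys, which was safe in Python 2 and raises a RuntimeError in Python 3. The example above also hides a second change: the Python 2 code encoded unicode values to UTF-8 byte strings, while the Python 3 version passes str through untouched, so any consumer that expects bytes will break.

A Prompt Engineering Playbook for Code Migration

After running thousands of migration conversions, these are the patterns that produce the best results:

1. System Message: The Migration Persona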

You are a staff-level software engineer performing a code migration.
Your output will be committed directly to a production repository.

CRITICAL RULES:
- Preserve ALL business logic exactly as-is. Do NOT refactor, optimize,
  or "improve" the code during migration.
- If you are unsure about a conversion, mark it with
  __MIGRATION_REVIEW__ in a comment.
- Do NOT add explanatory comments about the migration itself.
- Do NOT change variable names unless required by target language
  conventions.
- Output ONLY the migrated code. No markdown fences, no explanations.
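
The __MIGRATION_REVIEW__ marker requested in this system message only pays off if the pipeline actually looks for it. A minimal scanner might look like the sketch below; findReviewMarkers and the ReviewMarker shape are hypothetical names, and the regex assumes the marker always appears in a // line comment.

```typescript
interface ReviewMarker {
  line: number;   // 1-based line number in the generated file
  reason: string; // text after the marker, if the model gave one
}

const REVIEW_MARKER = /\/\/\s*__MIGRATION_REVIEW__:?\s*(.*)$/;

function findReviewMarkers(code: string): ReviewMarker[] {
  const markers: ReviewMarker[] = [];
  code.split(/\r?\n/).forEach((text, index) => {
    const match = text.match(REVIEW_MARKER);
    if (match) {
      markers.push({
        line: index + 1,
        reason: match[1].trim() || '(no reason given)',
      });
    }
  });
  return markers;
}
```

Any file with at least one marker can then skip the auto-merge tier regardless of its confidence score.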

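2. Few-Shot Examples Beat Long Instructions

Instead of writing 50 rules for converting AngularJS to React, include three real migration examples. Models pick up patterns far better from concrete examples than from abstract rules.

3. The "Preserve, Don't Improve" Principle

This is the most important rule. Left alone, LLMs will try to:

Add error handling that wasn't there
Optimize algorithms
Rename variables to be "clearer"
Add TypeScript types that are too strict or too loose

Every one of these changes is a risk. Keep migration and improvement in separate PRs.

4. Confidence Markers

Have the LLM mark the parts it is unsure about: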

If any conversion is ambiguous (e.g., unclear scope inheritance,
non-obvious side effects), add this exact comment:
// __MIGRATION_REVIEW__: [reason for uncertainty]

This will be caught by our post-processing pipeline and flagged
for human review.

A marker that says "I'm not sure about this part" is far more useful than hoping the model got it right.

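Production Guardrails: What Can Go Wrong

1. Hallucinated Imports

LLMs invent imports for packages your project doesn't have. The model saw import { useQueryClient } from '@tanstack/react-query' in its training data, so it uses it, even though your project uses SWR.

Fix: Validate every import against your actual package.json and project file structure.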
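
One way to implement that check is sketched below. It assumes you have already loaded the package names from package.json and built a set of known local module paths; ImportCheckContext and packageNameOf are hypothetical names, and the "@/" alias mirrors the import style used in this article's React example. Relative specifiers such as ./utils would need to be resolved against the importing file before the lookup.

```typescript
interface ImportCheckContext {
  declaredPackages: Set<string>; // dependencies + devDependencies from package.json
  projectFiles: Set<string>;     // known local modules, e.g. '@/hooks/useUserService'
}

// 'lodash/debounce' -> 'lodash'; '@tanstack/react-query/x' -> '@tanstack/react-query'
function packageNameOf(specifier: string): string {
  const parts = specifier.split('/');
  return specifier.startsWith('@') ? parts.slice(0, 2).join('/') : parts[0];
}

function isValidImport(specifier: string, ctx: ImportCheckContext): boolean {
  const isLocal = specifier.startsWith('.') || specifier.startsWith('@/');
  return isLocal
    ? ctx.projectFiles.has(specifier)
    : ctx.declaredPackages.has(packageNameOf(specifier));
}
```

Checking the package name rather than the full specifier lets deep imports such as swr/infinite pass as long as the package itself is declared.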

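2. Silent Behavior Changes

These are the most dangerous bugs. The code compiles and the tests pass (because the tests are shallow), but behavior has changed subtly. Common examples:

Event handler timing (AngularJS digest cycle vs. React state batching)
Differences in null/undefined handling
Ordering of async operations

Fix: Invest in integration tests before the migration starts. Write them against the old codebase, then confirm they still pass against the new code.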
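
For pure logic that can be lifted out of both codebases, the same idea works at a smaller scale: run the legacy and migrated functions on identical inputs and diff the results. The compareBehavior helper below is a hypothetical sketch, and the two default-value functions are made-up examples of the null/undefined category.

```typescript
interface Mismatch<I> {
  input: I;
  legacy: string;
  migrated: string;
}

// Run both implementations on the same inputs and report every case where
// the serialized result (or the thrown error message) differs.
function compareBehavior<I, O>(
  legacy: (input: I) => O,
  migrated: (input: I) => O,
  inputs: I[]
): Mismatch<I>[] {
  const capture = (fn: (input: I) => O, input: I): string => {
    try {
      return 'ok:' + JSON.stringify(fn(input));
    } catch (e) {
      return 'threw:' + (e instanceof Error ? e.message : String(e));
    }
  };
  return inputs
    .map(input => ({
      input,
      legacy: capture(legacy, input),
      migrated: capture(migrated, input),
    }))
    .filter(r => r.legacy !== r.migrated);
}

// Made-up example: `||` treats 0 as missing, `??` does not.
const legacyPageSize = (n: number | undefined): number => n || 10;
const migratedPageSize = (n: number | undefined): number => n ?? 10;
const pageSizeMismatches = compareBehavior(legacyPageSize, migratedPageSize, [undefined, 0, 5]);
```

Here the only reported mismatch is the input 0, exactly the kind of edge case a shallow test suite never exercises.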

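3. Over-Engineered Components

The LLM turns a simple AngularJS controller into a React component with four custom hooks, three context providers, and a reducer. Technically correct, but unmaintainable.

Fix: Add this to the prompt: "Prefer simplicity. Use useState, and reach for useReducer only when the state logic is complex enough to need it. Do not extract logic into a custom hook if only one component uses it."

4. Copy-Paste Explosion

When migrating similar components, the LLM generates near-identical code instead of extracting shared logic. You end up with 20 copies of the same data-fetching pattern.

Fix: After the initial migration, run a second pass that extracts shared patterns into hooks and utilities. This works better as a separate step because the LLM can then look at all the migrated files together.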
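
To decide where that second pass should look, it helps to find near-duplicates automatically. The sketch below swaps in a simple technique that the article does not prescribe: Jaccard similarity over token shingles, with identifiers and numbers normalized away. All names and the keyword list are assumptions for illustration.

```typescript
const KEYWORDS = new Set([
  'const', 'let', 'return', 'if', 'else', 'async', 'await',
  'try', 'catch', 'finally', 'function', 'import', 'export', 'from',
]);

// Replace identifiers with ID and numbers with NUM so that copies which
// differ only in naming still look alike.
function normalizedTokens(code: string): string[] {
  const raw: string[] = code.match(/[A-Za-z_$][\w$]*|\d+|[^\s\w]/g) ?? [];
  return raw.map(t => {
    if (/^[A-Za-z_$]/.test(t)) return KEYWORDS.has(t) ? t : 'ID';
    return /^\d/.test(t) ? 'NUM' : t;
  });
}

function shingles(tokens: string[], size: number): Set<string> {
  const out = new Set<string>();
  for (let i = 0; i + size <= tokens.length; i++) {
    out.add(tokens.slice(i, i + size).join(' '));
  }
  return out;
}

// Jaccard similarity of token shingles: 1 means structurally identical.
function similarity(a: string, b: string, size = 4): number {
  const sa = shingles(normalizedTokens(a), size);
  const sb = shingles(normalizedTokens(b), size);
  if (sa.size === 0 && sb.size === 0) return 1;
  let shared = 0;
  sa.forEach(s => { if (sb.has(s)) shared++; });
  return shared / (sa.size + sb.size - shared);
}
```

Pairs that score close to 1 are good candidates to hand to the LLM together with an instruction to extract the shared hook.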

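Measuring Migration Success

Track these metrics throughout the migration:

Metric	Target	How it's measured
Auto-merge rate	> 50%	Files that pass every automated check
Compile success rate	> 90%	First-pass TypeScript compilation
Test pass rate	> 85%	Against the existing test suite
Business logic preservation	100%	Integration test suite
Lines migrated per day	2,000+	After pipeline tuning
Human review time per file	< 15 min	Average for the "quick review" tier

The Migration Dashboard

Build a simple dashboard that tracks progress per module: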

interface MigrationStatus {
  module: string;
  totalFiles: number;
  migrated: number;
  autoMerged: number;
  inReview: number;
  manualRewrite: number;
  avgConfidence: number;
  blockers: string[];
}
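
The dashboard numbers can be derived directly from the triage results. The rollup below is a sketch: FileResult, ModuleSummary, and the 'pending' tier are assumptions added for illustration, and the blockers field is left out because it needs human input.

```typescript
type Tier = 'auto-merge' | 'quick-review' | 'deep-review' | 'manual-rewrite' | 'pending';

interface FileResult {
  module: string;
  tier: Tier;         // 'pending' means the file has not been migrated yet
  confidence: number; // 0-100, ignored for pending files
}

interface ModuleSummary {
  module: string;
  totalFiles: number;
  migrated: number;
  autoMerged: number;
  inReview: number;
  manualRewrite: number;
  avgConfidence: number;
}

function summarize(results: FileResult[]): ModuleSummary[] {
  // Group files by module, preserving first-seen order
  const byModule = new Map<string, FileResult[]>();
  for (const r of results) {
    byModule.set(r.module, [...(byModule.get(r.module) ?? []), r]);
  }
  return Array.from(byModule, ([name, files]) => {
    const done = files.filter(f => f.tier !== 'pending');
    const count = (...tiers: Tier[]) =>
      done.filter(f => tiers.indexOf(f.tier) !== -1).length;
    const avg = done.length
      ? Math.round(done.reduce((sum, f) => sum + f.confidence, 0) / done.length)
      : 0;
    return {
      module: name,
      totalFiles: files.length,
      migrated: done.length,
      autoMerged: count('auto-merge'),
      inReview: count('quick-review', 'deep-review'),
      manualRewrite: count('manual-rewrite'),
      avgConfidence: avg,
    };
  });
}
```

Because the summary is computed from per-file results, the same data can also feed the metrics table above without separate bookkeeping.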

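This visibility is essential for stakeholder communication. "We've migrated 147 of 200 files with a 92% auto-merge rate" is far more convincing than "it's going well."

Which LLM Should You Use?

As of April 2026, our migration testing shows:

Model	Strengths	Limitations
Claude 4 Sonnet	Preserving complex logic, TypeScript accuracy, faithful reproduction of business logic	The 200K context window can be too small for large multi-file batches
GPT-5	Broad language support, consistent formatting, strong instruction following	Tends to over-refactor during migration; higher cost per token
Gemini 2.5 Pro	Long context (1M tokens), multi-file understanding, cost-efficient at scale	Sometimes invents APIs that don't exist
DeepSeek V3	Cost-effective for simple conversions, strong on Python/Java patterns	Lower accuracy on complex business logic and cross-file dependencies

Recommendation: Use Claude 4 Sonnet or GPT-5 for the initial migration of complex files (where preserving logic matters most), and Gemini 2.5 Pro or DeepSeek for simple files and cleanup passes. The cost difference is 10-50x, and simple files don't need an expensive model.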
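
That recommendation can be encoded as a small routing rule in the pipeline. The sketch below is hypothetical: the tier names, both thresholds, and the escalation rule are assumptions to be tuned against your own pilot results, not measured values.

```typescript
type ModelTier = 'premium' | 'budget';

interface RoutableUnit {
  complexity: number;      // e.g. the score from calculateComplexity
  dependencyCount: number; // number of project dependencies
}

// Hypothetical thresholds; tune them against your pilot module.
const MAX_BUDGET_COMPLEXITY = 5;
const MAX_BUDGET_DEPENDENCIES = 2;

function pickModelTier(unit: RoutableUnit, previousAttemptFailed = false): ModelTier {
  // Escalate to the stronger model after a failed verification
  if (previousAttemptFailed) return 'premium';
  const simple =
    unit.complexity <= MAX_BUDGET_COMPLEXITY &&
    unit.dependencyCount <= MAX_BUDGET_DEPENDENCIES;
  return simple ? 'budget' : 'premium';
}
```

Retrying a failed budget-tier conversion on the premium tier keeps the cheap path as the default without letting hard files stall there.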

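The Cold Reality: What AI Can't Do

Let's be honest about the limits:

Architecture decisions: Should the monolithic AngularJS app become micro-frontends? AI can't answer that.
State management design: Zustand, Redux, or Context? An LLM will do what it's told, but it can't make the architectural call.
Performance optimization: An LLM doesn't know your traffic patterns or bottlenecks.
Business rule validation: If the original code contains a bug that has become "intended behavior" (because something elsewhere compensates for it), the LLM will faithfully reproduce that bug.
Cross-cutting concerns: Logging, monitoring, error tracking, feature flags. People have to design these for the new architecture.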

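Timeline: A Realistic Migration Plan

For a 200K LOC AngularJS→React migration:

Phase	Duration	What happens
1. Setup	2 weeks	Build the pipeline, configure the LLM, hand-migrate 5-10 reference examples
2. Pilot	2 weeks	Migrate one module (20-30 files) end to end, tune prompts
3. Scale	8-10 weeks	Run the pipeline over the remaining modules in dependency order
4. Polish	4 weeks	Fix edge cases, extract shared patterns, tune performance
5. Validation	2 weeks	Full regression testing, stakeholder sign-off

Total: 4-5 months (18-20 weeks), versus 12-18 months for a conventional manual migration.

Key insight: Phases 1 and 2 matter most. If the pilot module clears the auto-merge target (above 50% of files), the rest is execution. If the pilot fails, rethink the approach before you scale.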

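Migration Checklist

Before you start an AI-assisted migration:

Preparation

Existing test suite covers at least 70% of critical paths
Target framework conventions documented
5-10 manual reference migrations completed
AST parser set up for the source language
CI pipeline includes compile and test verification

Pipeline

Prompt template tested on 20+ representative files
Post-processing catches invalid imports
Confidence scoring calibrated (auto-merge threshold set)
Few-shot examples included for each file type
Dependency resolution order computed

Execution

Migrate in dependency order (leaf nodes first)
Verify each batch before starting the next
Track progress and confidence on the dashboard
Assign a human reviewer per module
Define a rollback strategy per module

Verification

Integration tests pass on the migrated code
Visual regression testing completed
Performance benchmarks held or improved
Security review of authentication and data-handling paths
Product team sign-off on migrated features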

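AI code migration isn't magic. In the end, it's engineering. You need a pipeline that squeezes everything out of what LLMs do well (pattern transformation) and puts tight guardrails around what they do badly (guaranteeing correctness). Build it well and a 12-18 month migration shrinks to 4-5 months. Build it badly and you'll spend those months debugging why the AI rewrote your authentication logic on its own initiative.

Build the pipeline first. Then verify everything.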