# 🎯 Changelog: Three "Patches"

## Updated

2026-04-05

## Summary

Following suggestions from Xiaoyaozi (逍遥子), three functional patches were added to the lobster memory sync system to improve its performance, usability, and safety.

---

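## 📦 Patch 1: Chunked Reads and Streaming

### Problem

- A lobster's memory files (for example, some logs or vector snapshots) can exceed 50 MB.
- Loading such a file in a single `GET /api/diff` request makes backend memory usage spike.

### Solution

- **Streamed reads**: read large files in 8 KB chunks instead of loading them into memory in one go.
- **Streamed hashing**: compute the hash directly from the file stream, without holding the full content.
- **Diff size limit**: for large files, show only the first and last 500 lines and omit the middle.

### Implementation Details
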
```python
# services.py
import hashlib
from pathlib import Path
from typing import Dict, Iterator


class FileScanner:
    chunk_size = 8192  # per read: 8 KB in binary mode, 8192 characters in text mode

    def read_file_chunked(self, file_path: Path) -> str:
        """Read a file chunk by chunk and return the joined text.

        The returned string still holds the whole file; use read_file_stream
        when the content must not be fully loaded into memory.
        """
        content_parts = []
        with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
            while True:
                chunk = f.read(self.chunk_size)
                if not chunk:
                    break
                content_parts.append(chunk)
        return ''.join(content_parts)

    def read_file_stream(self, file_path: str) -> Iterator[str]:
        """Yield a file chunk by chunk (for transferring large files)."""
        with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
            while True:
                chunk = f.read(self.chunk_size)
                if not chunk:
                    break
                yield chunk

    def compute_hash_stream(self, file_path: Path) -> str:
        """Compute a file's SHA-256 from a stream (avoids memory issues with large files)."""
        hash_obj = hashlib.sha256()
        with open(file_path, 'rb') as f:
            while True:
                chunk = f.read(self.chunk_size)
                if not chunk:
                    break
                hash_obj.update(chunk)
        return hash_obj.hexdigest()


class DiffChecker:
    def get_file_diff(self, local_content: str, db_content: str, max_lines: int = 1000) -> Dict:
        """Get the diff between two versions of a file (with a line cap for large files)."""
        local_lines = local_content.split('\n')
        db_lines = db_content.split('\n')

        # Cap the line count: large files keep only the head and the tail
        if len(local_lines) > max_lines:
            local_head = local_lines[:max_lines//2]
            local_tail = local_lines[-max_lines//2:]
            local_lines = local_head + ['... ({} lines omitted) ...'.format(len(local_lines) - max_lines)] + local_tail
        # (the excerpt ends here)
```

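The `get_file_diff` excerpt above stops right after truncating the local side. As a rough, self-contained sketch of how the head/tail cap could feed into an actual diff, the block below applies the same cap to both sides and then uses the standard library's `difflib.unified_diff`. The helper names `truncate_lines` and `capped_diff` are made up for illustration, and this is not necessarily how `DiffChecker` produces its result.

```python
import difflib
from typing import List


def truncate_lines(lines: List[str], max_lines: int = 1000) -> List[str]:
    """Keep the first and last max_lines // 2 lines and mark the gap between them."""
    half = max_lines // 2
    if len(lines) <= max_lines or half == 0:
        return list(lines)
    omitted = len(lines) - 2 * half
    marker = '... ({} lines omitted) ...'.format(omitted)
    return lines[:half] + [marker] + lines[-half:]


def capped_diff(local_content: str, db_content: str, max_lines: int = 1000) -> List[str]:
    """Unified diff (database version -> local version) after capping each side."""
    local_lines = truncate_lines(local_content.split('\n'), max_lines)
    db_lines = truncate_lines(db_content.split('\n'), max_lines)
    return list(difflib.unified_diff(db_lines, local_lines,
                                     fromfile='db', tofile='local', lineterm=''))
```

Capping each side independently is a simplification: when the two versions differ a lot in length, the retained tails no longer line up, so the diff of the truncated texts is only an approximation of the real one.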
### API Update

```http
# Get a file diff (with chunked reading)
GET /api/diff/?lobster_id=daotong&file_path=large-file.log&chunked=true
```


---

## 📦 Patch 2: The `.lobsterignore` Mechanism

### Problem

- Temporary files (such as `.DS_Store` or log caches) do not need to go into the database.
- A hand-maintained exclusion list keeps things tidy.

### Solution

- Add a `.lobsterignore` file (similar to `.gitignore`).
- Skip matching files automatically during scans.
- Ship a set of default ignore rules.

### Implementation Details

```python
# services.py
import fnmatch
from pathlib import Path


class IgnorePattern:
    """Pattern matcher for .lobsterignore files."""

    def __init__(self, base_dir: Path):
        self.base_dir = base_dir
        self.patterns = []
        self.load_patterns()

    def load_patterns(self):
        """Load patterns from .lobsterignore, then append the defaults."""
        # Start fresh so that reloading does not accumulate duplicates
        self.patterns = []
        ignore_file = self.base_dir / '.lobsterignore'

        if ignore_file.exists():
            with open(ignore_file, 'r', encoding='utf-8') as f:
                for line in f:
                    line = line.strip()
                    # Skip blank lines and comments
                    if line and not line.startswith('#'):
                        self.patterns.append(line)

        # Add the default ignore rules
        default_patterns = [
            '.DS_Store', '.git', '.gitignore', '__pycache__',
            'node_modules', '*.pyc', '*.pyo', '*.log',
            '*.tmp', '*.temp', '*.bak', '.vscode', '.idea'
        ]
        for pattern in default_patterns:
            if pattern not in self.patterns:
                self.patterns.append(pattern)

    def is_ignored(self, file_path: Path) -> bool:
        """Return True if the file matches any ignore pattern."""
        relative_path = file_path.relative_to(self.base_dir)

        for pattern in self.patterns:
            # Match the file name
            if fnmatch.fnmatch(file_path.name, pattern):
                return True

            # Match the full relative path
            if fnmatch.fnmatch(str(relative_path), pattern):
                return True

            # Directory pattern (trailing slash): compared with the parent path only
            if pattern.endswith('/') and fnmatch.fnmatch(str(relative_path.parent), pattern.rstrip('/')):
                return True

            # '*/name' pattern: compared with every path component
            if pattern.startswith('*/'):
                for part in relative_path.parts:
                    if fnmatch.fnmatch(part, pattern[2:]):
                        return True

        return False
```

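One thing to keep in mind when reading `is_ignored`: it compares each pattern with the file name, the full relative path, and the immediate parent path. A nested file such as `node_modules/pkg/index.js` therefore matches neither `node_modules` nor `node_modules/` unless the caller also checks directories while scanning. As an alternative, the sketch below tests every path component. The function `matches_any_component` is hypothetical, not part of `services.py`, and it assumes forward-slash relative paths.

```python
import fnmatch
from pathlib import PurePosixPath
from typing import Iterable


def matches_any_component(relative_path: str, patterns: Iterable[str]) -> bool:
    """Return True if any component of the path matches any pattern.

    A trailing slash marks a directory-only pattern: it is tested against the
    directory components but not against the final file name.
    """
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return False
    dirs = parts[:-1]
    for pattern in patterns:
        if pattern.endswith('/'):
            dir_pattern = pattern.rstrip('/')
            if any(fnmatch.fnmatch(d, dir_pattern) for d in dirs):
                return True
        elif any(fnmatch.fnmatch(p, pattern) for p in parts):
            return True
    return False
```

With this approach a single rule such as `node_modules/` covers everything below that directory, at the cost of no longer supporting patterns that span several components (for example `docs/*.md`).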
### Example File

```bash
# .lobsterignore
# System files
.DS_Store
Thumbs.db

# IDEs and editors
.vscode/
.idea/
*.swp

# Python
__pycache__/
*.pyc
*.log

# Node.js
node_modules/

# Temporary files
*.tmp
*.bak
```

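To show how rules like the ones in this example file could be applied during a scan, here is a minimal sketch built on `os.walk`. It prunes ignored directories before descending into them, so nothing under `node_modules/` is ever visited. `scan_files` is an illustrative name and the matching is deliberately simpler than `IgnorePattern` (names only, no path patterns); how the real scanner walks the tree is not shown in this changelog.

```python
import fnmatch
import os
from pathlib import Path
from typing import Iterator, List


def scan_files(base_dir: Path, patterns: List[str]) -> Iterator[Path]:
    """Yield files under base_dir, skipping ignored names and pruning ignored directories."""

    def ignored(name: str, is_dir: bool) -> bool:
        for pattern in patterns:
            if pattern.endswith('/'):
                # Directory-only pattern: never applies to plain files
                if is_dir and fnmatch.fnmatch(name, pattern.rstrip('/')):
                    return True
            elif fnmatch.fnmatch(name, pattern):
                return True
        return False

    for root, dirs, files in os.walk(base_dir):
        # Editing dirs in place stops os.walk from descending into ignored directories
        dirs[:] = sorted(d for d in dirs if not ignored(d, is_dir=True))
        for name in sorted(files):
            if not ignored(name, is_dir=False):
                yield Path(root) / name
```

Pruning at the directory level matters for performance as much as for correctness: a large `node_modules/` tree is skipped with a single comparison instead of one check per file.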
### API Update

```http
# List the ignore rules
GET /api/ignore/patterns/

# Reload the ignore rules
POST /api/ignore/reload/
```


---

## 📦 Patch 3: Operation Tracing (Audit Log)

### Problem

- If someone clicks the wrong thing one day, there is no way to find out which operation caused the damage.
- Operation history needs to be recorded so that problems can be traced back.

### Solution

- Add a `SyncHistory` model.
- Record the details of every sync operation.
- Provide a history query API.

### Implementation Details

```python
# models.py
from django.db import models


class SyncHistory(models.Model):
    """History of sync operations."""

    ACTION_CHOICES = [
        ('sync_to_db', 'Sync to database'),
        ('sync_to_local', 'Sync to local'),
        ('auto_sync', 'Auto sync'),
        ('manual_merge', 'Manual merge'),
    ]

    STATUS_CHOICES = [
        ('success', 'Success'),
        ('failed', 'Failed'),
        ('partial', 'Partially successful'),
    ]

    lobster_id = models.CharField(max_length=50, help_text='Lobster ID')
    file_path = models.CharField(max_length=500, help_text='Relative file path')
    action = models.CharField(max_length=20, choices=ACTION_CHOICES, help_text='Action type')
    status = models.CharField(max_length=20, choices=STATUS_CHOICES, help_text='Action status')
    old_version = models.IntegerField(null=True, blank=True, help_text='Version before the action')
    new_version = models.IntegerField(null=True, blank=True, help_text='Version after the action')
    old_hash = models.CharField(max_length=64, null=True, blank=True, help_text='Hash before the action')
    new_hash = models.CharField(max_length=64, null=True, blank=True, help_text='Hash after the action')
    file_size = models.IntegerField(default=0, help_text='File size (bytes)')
    operator = models.CharField(max_length=50, default='system', help_text='Operator')
    error_message = models.TextField(null=True, blank=True, help_text='Error message')
    execution_time = models.FloatField(default=0, help_text='Execution time (seconds)')
    created_at = models.DateTimeField(auto_now_add=True, help_text='Time of the action')


# services.py
from typing import Dict, List, Optional


class AuditLogger:
    """Records sync operations."""

    def log_sync_action(
        self,
        lobster_id: str,
        file_path: str,
        action: str,
        old_version: Optional[int] = None,
        new_version: Optional[int] = None,
        old_hash: Optional[str] = None,
        new_hash: Optional[str] = None,
        file_size: int = 0,
        operator: str = 'system',
        status: str = 'success',
        error_message: Optional[str] = None,
        execution_time: float = 0
    ):
        """Record one sync operation."""
        self.model.objects.create(...)

    def get_history(
        self,
        lobster_id: Optional[str] = None,
        file_path: Optional[str] = None,
        action: Optional[str] = None,
        limit: int = 100
    ) -> List[Dict]:
        """Return the operation history."""
        queryset = self.model.objects.all()
        # Filter and sort...
```

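The body of `get_history` is elided above ("Filter and sort..."). The block below is not that implementation. It is a plain-Python sketch of the behaviour the signature suggests, under the assumption that each argument is an optional equality filter, that results come back newest first, and that `limit` caps the count. `filter_history` is a made-up name, and records are modelled as dictionaries with ISO 8601 `created_at` strings.

```python
from typing import Dict, List, Optional


def filter_history(
    records: List[Dict],
    lobster_id: Optional[str] = None,
    file_path: Optional[str] = None,
    action: Optional[str] = None,
    limit: int = 100,
) -> List[Dict]:
    """Apply the optional equality filters, sort newest first, and cap the result."""
    filters = {'lobster_id': lobster_id, 'file_path': file_path, 'action': action}
    # Only filters that were actually supplied take part in the match
    active = {key: value for key, value in filters.items() if value is not None}
    matched = [r for r in records if all(r.get(k) == v for k, v in active.items())]
    # ISO 8601 timestamps in the same format sort correctly as strings
    matched.sort(key=lambda r: r['created_at'], reverse=True)
    return matched[:limit]
```

On a Django queryset the same idea is usually expressed with `filter(...)`, `order_by('-created_at')`, and slicing, which pushes the work to the database instead of Python.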
### Usage Example

```python
# views.py
import time

from rest_framework.decorators import api_view


@api_view(['POST'])
def sync_to_db(request):
    """Sync to the database (with audit logging)."""
    audit_logger = AuditLogger()

    start_time = time.time()

    try:
        # Perform the sync...
        execution_time = time.time() - start_time

        # Log the success
        audit_logger.log_sync_action(
            lobster_id=lobster_id,
            file_path=file_path,
            action='sync_to_db',
            old_version=old_version,
            new_version=new_version,
            old_hash=old_hash,
            new_hash=file_hash,
            file_size=record.size,
            operator=operator,
            status='success',
            execution_time=execution_time
        )

    except Exception as e:
        # Measure again here: the assignment in the try block may not have run
        execution_time = time.time() - start_time

        # Log the failure
        audit_logger.log_sync_action(
            lobster_id=lobster_id,
            file_path=file_path,
            action='sync_to_db',
            operator=operator,
            status='failed',
            error_message=str(e),
            execution_time=execution_time
        )
```

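The success and failure branches in the view repeat most of their arguments. One way to factor that out, sketched here under the assumption that the logging callable accepts keyword arguments like `log_sync_action` does, is a small context manager that times the block and writes exactly one entry. The name `audited` is hypothetical, and the sketch uses `time.monotonic()`, which is unaffected by system clock adjustments.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


@contextmanager
def audited(log: Callable[..., None], **fields) -> Iterator[Dict]:
    """Time the wrapped block and log one 'success' or 'failed' entry.

    The caller can add result fields (for example new_hash) to the yielded dict.
    """
    extra: Dict = {}
    start = time.monotonic()
    try:
        yield extra
    except Exception as exc:
        entry = dict(fields, status='failed', error_message=str(exc),
                     execution_time=time.monotonic() - start)
        log(**entry)
        raise  # leave it to the caller to turn the error into a response
    else:
        entry = dict(fields, **extra)
        entry.update(status='success', execution_time=time.monotonic() - start)
        log(**entry)
```

A view could then wrap its work in `with audited(audit_logger.log_sync_action, lobster_id=..., file_path=..., action='sync_to_db') as extra:` and fill in `extra['new_hash']` and similar fields once they are known.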
### API Update

```http
# Get the operation history
GET /api/history/?lobster_id=daotong&file_path=MEMORY.md&limit=50
```

### Example History Record

```json
{
  "success": true,
  "data": [
    {
      "id": 1,
      "lobster_id": "daotong",
      "file_path": "MEMORY.md",
      "action": "sync_to_db",
      "action_display": "Sync to database",
      "status": "success",
      "status_display": "Success",
      "old_version": 1,
      "new_version": 2,
      "old_hash": "abc123...",
      "new_hash": "def456...",
      "file_size": 1234,
      "operator": "逍遥子",
      "error_message": null,
      "execution_time": 0.123,
      "created_at": "2026-04-05T12:00:00Z"
    }
  ]
}
```

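For tracing problems, a client will mostly care about the entries that did not succeed. The sketch below assumes the response envelope shown in the example (a `success` flag plus a `data` list whose items carry a `status` field) and returns a per-status count together with the problem entries. `summarize_history` is an illustrative helper, not part of the project.

```python
import json
from collections import Counter
from typing import Dict, List, Tuple


def summarize_history(payload: str) -> Tuple[Dict[str, int], List[Dict]]:
    """Return (count per status, entries whose status is not 'success')."""
    body = json.loads(payload)
    if not body.get('success'):
        raise ValueError('history request was not successful')
    entries = body.get('data', [])
    counts = Counter(entry['status'] for entry in entries)
    # 'failed' and 'partial' entries are the ones worth inspecting
    problems = [entry for entry in entries if entry['status'] != 'success']
    return dict(counts), problems
```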

---

## 📋 Database Migration

A database migration is required to create the `SyncHistory` table:

```bash
# Enter the backend container
docker exec -it lobster-backend bash

# Create and apply the migration
python manage.py makemigrations memory_app
python manage.py migrate
```


---

## ✅ Completion Checklist

- [x] Chunked reads and streaming (services.py)
- [x] `.lobsterignore` mechanism (services.py + .lobsterignore.example)
- [x] Operation tracing (models.py + services.py + views.py + serializers.py)
- [x] New API endpoints (urls.py)
- [x] Documentation updated (CHANGELOG.md)

---

## 🚀 Next Steps

1. Run the database migration
2. Push the code to the remote repository
3. Update the frontend (add history and ignore-rule management)

---

**Thanks to Xiaoyaozi (逍遥子) for the valuable suggestions!** 🙏