使用另一个文本文件的单词列表顺序替换单词 - data_file

Question

在我的脑海里我有：

阅读"original_file", 将第 3 行 "ENTRY1" 更改为 data_file 中第一个单词的行。写出new_file1。阅读 "original_file", 将第 3 行 "ENTRY1" 更改为 data_file 中第二个单词的行。写出 new_file2

重复整个 data_file。

excerpt/example:

original_file:

    line1      {
    line2        "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c",
    line3        "name": "ENTRY1",
    line4        "auto": true,
    line5        "contexts": [],
    line6        "responses": [
    line7      {
    ------------

    data_file:(simply a word/number List)
    line1   AAA11
    line2   BBB12
    line3   CCC13
    ..100lines/Words..
    -------------

    *the First output/finished file would look like:
    newfile1:
    line1      {
    line2        "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c",
    line3        "name": "AAA11",
    line4        "auto": true,
    line5        "contexts": [],
    line6        "responses": [
    line7      {
    ------------
    and the Second:
    newfile2:
    line1      {
    line2        "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c",
    line3        "name": "BBB12",
    line4        "auto": true,
    line5        "contexts": [],
    line6        "responses": [
    line7      {
    ------------

..等等。

我一直在尝试使用 sed，比如

awk 'FNR==$n1{if((getline line < "data_file") > 0) fprint '/"id:"/' '/""/' line ; next}$n2' < newfile

并且..作为shell脚本的开始..

#!/bin/bash
n1=3
n2=2
sed '$n1;$n2 data_file' original_file > newfile

如有任何帮助，我们将不胜感激。我一直在尝试将在 SO 上找到的各种技术粘合在一起。一次做一件事。学习如何替换。。然后从第二个文件替换..但它超出了我的知识。再次感谢。我的 data_file 中大约有 31,000 行..所以这是必要的.. （要自动化）。这是一次性的事情，但可能对其他人非常有用？

Answer 1

在Python 2.+:

输入：

more original_file.json data_file 
::::::::::::::
original_file.json
::::::::::::::
{
  "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c",
  "name": "ENTRY1",
  "auto": true,
  "contexts": [],
  "responses": []
}
::::::::::::::
data_file
::::::::::::::
AAA11
BBB12
CCC13

python 脚本：

import json

#open the original json file
with open('original_file.json') as handle:
  #create a dict based on the json content
  dictdump = json.loads(handle.read())
  #file counter
  i = 1
  #open the data file
  f = open("data_file", "r")
  #get all lines of the data file
  lines = f.read().splitlines()
  #close it
  f.close()
  #for each line of the data file
  for line in lines:
    #change the value of the json name element by the current line
    dictdump['name'] = line
    #open newfileX
    o = open("newfile" + str(i),'w')
    #dump the content of modified json
    json.dump(dictdump,o)
    #close the file
    o.close()
    #increase the counter value
    i += 1

输出：

more newfile*
::::::::::::::
newfile1
::::::::::::::
{"contexts": [], "auto": true, "responses": [], "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", "name": "AAA11"}
::::::::::::::
newfile2
::::::::::::::
{"contexts": [], "auto": true, "responses": [], "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", "name": "BBB12"}
::::::::::::::
newfile3
::::::::::::::
{"contexts": [], "auto": true, "responses": [], "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", "name": "CCC13"}

如果你需要垂直输出json转储，你可以调整行： json.dump(dictdump, o) 变成 json.dump(dictdump, o, indent=4)。这将产生：

more newfile*
::::::::::::::
newfile1
::::::::::::::
{
    "contexts": [], 
    "auto": true, 
    "responses": [], 
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "AAA11"
}
::::::::::::::
newfile2
::::::::::::::
{
    "contexts": [], 
    "auto": true, 
    "responses": [], 
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "BBB12"
}
::::::::::::::
newfile3
::::::::::::::
{
    "contexts": [], 
    "auto": true, 
    "responses": [], 
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "CCC13"
}

文档：https://docs.python.org/2/library/json.html

更新版本以保持与输入相同的顺序：

import json
from collections import OrderedDict

#open the original json file
with open('original_file.json') as handle:
  #create a dict based on the json content
  dictdump = json.loads(handle.read(), object_pairs_hook=OrderedDict)
  #file counter
  i = 1
  #open the data file
  f = open("data_file", "r")
  #get all lines of the data file
  lines = f.read().splitlines()
  #close it
  f.close()
  #for each line of the data file
  for line in lines:
    #change the value of the json name element by the current line
    dictdump['name'] = line
    #open newfileX
    o = open("newfile" + str(i),'w')
    #dump the content of modified json
    json.dump(dictdump, o, indent=4)
    #close the file
    o.close()
    #increase the counter value
    i += 1

输出：

more newfile*
::::::::::::::
newfile1
::::::::::::::
{
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "AAA11", 
    "auto": true, 
    "contexts": [], 
    "responses": []
}
::::::::::::::
newfile2
::::::::::::::
{
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "BBB12", 
    "auto": true, 
    "contexts": [], 
    "responses": []
}
::::::::::::::
newfile3
::::::::::::::
{
    "id": "b5902627-0ba0-40b6-8127-834a3ddd6c2c", 
    "name": "CCC13", 
    "auto": true, 
    "contexts": [], 
    "responses": []
}

Answer 2

假设我们正在尝试更改某些 JSON 数据中的 'name' 并且新值将是纯字母数字（以便双引号正常工作）：

#!/bin/bash

n=1
cat data_file | while read value; do
    jq <original_file >"newfile$n" ".name = \"$value\""
    ((n++))
done

使用另一个文本文件的单词列表顺序替换单词 - data_file

sequentially replace a word using another text file list of words - data_file

regex

awk

grep

cat