darknet_video modifications

Line-by-line explanation
Here's an explanation of the code, one function at a time:

def convert2original(image, bbox):
    x, y, w, h = convert2relative(bbox)

    image_h, image_w, __ = image.shape

    orig_x       = int(x * image_w)
    orig_y       = int(y * image_h)
    orig_width   = int(w * image_w)
    orig_height  = int(h * image_h)

    bbox_converted = (orig_x, orig_y, orig_width, orig_height)

    return bbox_converted

This function takes an image and a bounding box bbox given as a tuple (x, y, width, height) in the coordinates of the resized network input, where x and y appear to be the box center (convert4cropping() below treats them that way). It first normalizes the box to relative coordinates with convert2relative() (not shown here), then scales those values by the width and height of the original image and returns the box as (x, y, width, height) in pixel coordinates of the original image.
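As a quick check of the arithmetic, here is a minimal sketch. It assumes that convert2relative() simply divides by the network input size; the 416 by 416 network size, the 1280 by 720 frame size, and the helper names to_relative() and to_original() are made up for illustration.

```python
# Hypothetical sizes, chosen only for illustration
NET_W, NET_H = 416, 416        # network input size (assumed)
FRAME_W, FRAME_H = 1280, 720   # original frame size (assumed)

def to_relative(bbox):
    """Normalize a (cx, cy, w, h) box in network pixels to the 0..1 range."""
    x, y, w, h = bbox
    return x / NET_W, y / NET_H, w / NET_W, h / NET_H

def to_original(bbox):
    """Scale a network-space box to pixel coordinates of the original frame."""
    x, y, w, h = to_relative(bbox)
    return int(x * FRAME_W), int(y * FRAME_H), int(w * FRAME_W), int(h * FRAME_H)

box = to_original((208, 208, 104, 52))
print(box)  # (640, 360, 320, 90)
```

The box centered in the network input stays centered in the original frame; only the scale changes, and it changes independently along each axis.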

def convert4cropping(image, bbox):
    x, y, w, h = convert2relative(bbox)

    image_h, image_w, __ = image.shape

    orig_left    = int((x - w / 2.) * image_w)
    orig_right   = int((x + w / 2.) * image_w)
    orig_top     = int((y - h / 2.) * image_h)
    orig_bottom  = int((y + h / 2.) * image_h)

    if (orig_left < 0): orig_left = 0
    if (orig_right > image_w - 1): orig_right = image_w - 1
    if (orig_top < 0): orig_top = 0
    if (orig_bottom > image_h - 1): orig_bottom = image_h - 1

    bbox_cropping = (orig_left, orig_top, orig_right, orig_bottom)

    return bbox_cropping

Like convert2original(), this function takes an image and a bounding box bbox given as (x, y, width, height) and normalizes it with convert2relative(). It then treats (x, y) as the box center and computes the absolute pixel coordinates of the left, top, right, and bottom edges in the original image. No padding is added: the four if statements clamp the edges to the image bounds, so the returned (left, top, right, bottom) tuple can be used directly to crop the detected object from the image later.
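The clamping step can be sketched on its own. This is an illustration only: clamp_box() is a hypothetical helper, and a nested list stands in for the image so the example runs without OpenCV or NumPy.

```python
def clamp_box(left, top, right, bottom, image_w, image_h):
    """Clamp corner coordinates to the valid pixel range of the image."""
    left = max(0, left)
    top = max(0, top)
    right = min(image_w - 1, right)
    bottom = min(image_h - 1, bottom)
    return left, top, right, bottom

image_w, image_h = 1280, 720
frame = [[0] * image_w for _ in range(image_h)]  # stand-in for a video frame

# A box that sticks out past the left and bottom edges of the frame
left, top, right, bottom = clamp_box(-20, 650, 300, 800, image_w, image_h)

# Rows (y) are indexed first, then columns (x), as with a NumPy image
crop = [row[left:right] for row in frame[top:bottom]]
print((left, top, right, bottom), len(crop), len(crop[0]))
```

With a real frame the crop is the same slice written as frame[top:bottom, left:right].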

def video_capture(frame_queue, darknet_image_queue):
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frame_resized = cv2.resize(frame_rgb, (darknet_width, darknet_height),
                                   interpolation=cv2.INTER_LINEAR)
        frame_queue.put(frame)
        img_for_detect = darknet.make_image(darknet_width, darknet_height, 3)
        darknet.copy_image_from_bytes(img_for_detect, frame_resized.tobytes())
        darknet_image_queue.put(img_for_detect)
    cap.release()

This function runs in its own thread and reads successive frames from the capture device (e.g. a webcam or video file). For each frame, it converts the color space from BGR to RGB and resizes the result to the input size expected by the YOLOv3 model (darknet_width by darknet_height). It puts the original, unresized BGR frame on frame_queue for the drawing thread, then allocates a Darknet image, copies the resized RGB bytes into it, and puts it on darknet_image_queue for the inference thread. When a read fails, which is also what happens at the end of a video file, the loop exits and the capture device is released.

def inference(darknet_image_queue, detections_queue, fps_queue):
    while cap.isOpened():
        darknet_image = darknet_image_queue.get()
        prev_time = time.time()
        detections = darknet.detect_image(network, class_names, darknet_image, thresh=args.thresh)
        detections_queue.put(detections)
        fps = int(1/(time.time() - prev_time))
        fps_queue.put(fps)
        print("FPS: {}".format(fps))
        darknet.print_detections(detections, args.ext_output)
        darknet.free_image(darknet_image)
    cap.release()

This function also runs in its own thread. It takes each Darknet image from darknet_image_queue and runs detection on it with darknet.detect_image(), using args.thresh as the confidence threshold. It puts the detections (label, confidence, and bounding box for each object) on detections_queue, computes the frames per second (FPS) from the time that single inference took, puts the value on fps_queue, prints it and the detections to the console, and finally frees the memory allocated to the Darknet image.
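The FPS value is later passed to cv2.waitKey() as a delay, so two edge cases are worth guarding: an elapsed time of zero, and an FPS of zero (cv2.waitKey(0) waits for a key press indefinitely). A minimal sketch, where timed_call() and wait_delay_ms() are hypothetical helpers and not part of darknet:

```python
import time

def timed_call(fn, *args):
    """Run fn once and return (result, fps) based on how long the call took."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on timers with coarse resolution
    fps = int(1 / elapsed) if elapsed > 0 else 0
    return result, fps

def wait_delay_ms(fps):
    """Delay to hand to cv2.waitKey(); never 0, which would block until a key press."""
    return max(1, fps)

result, fps = timed_call(sum, [1, 2, 3])  # sum() stands in for the detection call
print(result, wait_delay_ms(fps))
```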

def drawing(frame_queue, detections_queue, fps_queue):
    random.seed(3)  # deterministic bbox colors
    video = set_saved_video(cap, args.out_filename, (video_width, video_height))
while cap.is




PaddleOCR
To perform character recognition on the detected bounding boxes using PaddleOCR and output the recognition result, you can modify the `drawing()` function to include another thread that takes in detections from the queue and performs OCR on the cropped images within those bounding boxes. Here's an example of what this modified function might look like:

import random
from queue import Queue
from threading import Thread

import cv2
import darknet
from paddleocr import PaddleOCR

# Build the OCR engine once; creating it on every call would reload the models each time.
# lang='ch' selects the Chinese models (PaddleOCR 2.x-style API).
ocr_engine = PaddleOCR(lang='ch')

def drawing(frame_queue, detections_queue, fps_queue):
    random.seed(3)  # deterministic bbox colors
    video = set_saved_video(cap, args.out_filename, (video_width, video_height))

    # Start OCR thread. It reads (frame, detections) pairs from its own input
    # queue: if it also read from detections_queue, each item would reach only
    # one of the two consumers, and it would have no frame to crop from.
    ocr_input_queue = Queue()
    ocr_queue = Queue()
    Thread(target=perform_ocr, args=(ocr_input_queue, ocr_queue), daemon=True).start()

    while cap.isOpened():
        frame = frame_queue.get()
        detections = detections_queue.get()
        fps = fps_queue.get()
        detections_adjusted = []
        if frame is not None:
            ocr_input_queue.put((frame, detections))
            ocr_results = ocr_queue.get()
            for label, confidence, bbox in detections:
                bbox_adjusted = convert2original(frame, bbox)
                detections_adjusted.append((str(label), confidence, bbox_adjusted))

                # Look up the OCR text; fall back to running OCR here if it is missing
                ocr_text = ocr_results.get(str(bbox_adjusted))
                if ocr_text is None:
                    left, top, right, bottom = convert4cropping(frame, bbox)
                    ocr_text = perform_paddleocr(frame[top:bottom, left:right])
                    ocr_results[str(bbox_adjusted)] = ocr_text

            image = darknet.draw_boxes(detections_adjusted, frame, class_colors)
            if not args.dont_show:
                cv2.imshow('Inference', image)
            if args.out_filename is not None:
                video.write(image)
            if cv2.waitKey(fps) == 27:
                break
    cap.release()
    video.release()
    cv2.destroyAllWindows()

def perform_ocr(ocr_input_queue, ocr_queue):
    while cap.isOpened():
        frame, detections = ocr_input_queue.get()
        ocr_results = {}
        for label, confidence, bbox in detections:
            bbox_adjusted = convert2original(frame, bbox)

            # Extract cropped image and perform OCR
            left, top, right, bottom = convert4cropping(frame, bbox)
            cropped_image = frame[top:bottom, left:right]
            ocr_results[str(bbox_adjusted)] = perform_paddleocr(cropped_image)

        ocr_queue.put(ocr_results)

def perform_paddleocr(image):
    if image.size == 0:  # degenerate box, nothing to recognize
        return ''
    ocr_result = ''
    result = ocr_engine.ocr(image)
    for line in result or []:
        for word in line or []:  # line is None when no text is found
            ocr_result += word[1][0]  # word is [box, (text, confidence)]
    return ocr_result


In this modified `drawing()` function, we create two queues, `ocr_input_queue` and `ocr_queue`, and start a new thread with `Thread(target=perform_ocr, args=(ocr_input_queue, ocr_queue), daemon=True).start()`. The OCR thread gets its own input queue rather than sharing `detections_queue` with `drawing()`: a queue item is delivered to exactly one consumer, so two readers on `detections_queue` would each see only some of the detections, and the OCR thread also needs the frame to crop from.

Inside the main while loop, `drawing()` hands each `(frame, detections)` pair to the OCR thread and waits for the `ocr_results` dictionary to come back on `ocr_queue`. The OCR thread crops each bounding box with `convert4cropping()`, runs PaddleOCR on the crop, and stores the text in `ocr_results` with the adjusted bounding box coordinates as the key. Back in `drawing()`, we look up the text for each box; if there is no entry, we run PaddleOCR on the crop directly and save the result under the same key.

Because `drawing()` waits for each result, OCR runs in step with drawing rather than in parallel with it. The recognized text is available in `ocr_text` for each bounding box but is not yet drawn on the frame.

Note that this example assumes PaddleOCR is installed; `PaddleOCR(lang='ch')` selects the pre-trained Chinese models, which PaddleOCR downloads on first use. You may need to adjust this if you are using a different model or language.
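Reading text out of the OCR result depends on its nested layout. The sketch below assumes the layout returned by `ocr()` in PaddleOCR 2.x (one list per image, each entry a `[box, (text, confidence)]` pair); other versions may differ. The helper name `join_ocr_text()` and the sample data are made up for illustration.

```python
def join_ocr_text(result):
    """Concatenate recognized text from a PaddleOCR-2.x-style result (assumed layout)."""
    texts = []
    for page in result or []:
        # page is None when no text was found in the image
        for _box, (text, _score) in page or []:
            texts.append(text)
    return "".join(texts)

# Hand-built sample in the assumed layout: two text regions in one image
sample = [[
    [[[0, 0], [50, 0], [50, 20], [0, 20]], ("京A", 0.98)],
    [[[55, 0], [120, 0], [120, 20], [55, 20]], ("12345", 0.95)],
]]
print(join_ocr_text(sample))
```

Each entry's second element is a (text, confidence) tuple, which is why the text itself is one level deeper than `word[1]`.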








tesseract

To perform character recognition on the detected bounding boxes, you can modify the `drawing()` function to include another thread that takes in detections from the queue and performs optical character recognition (OCR) on the cropped images within those bounding boxes. Here's an example of what this modified function might look like:

import random
from queue import Queue
from threading import Thread

import cv2
import darknet
import pytesseract

def drawing(frame_queue, detections_queue, fps_queue):
    random.seed(3)  # deterministic bbox colors
    video = set_saved_video(cap, args.out_filename, (video_width, video_height))

    # Start OCR thread with its own input queue, so that it receives the frame
    # together with the detections and does not compete for detections_queue
    ocr_input_queue = Queue()
    ocr_queue = Queue()
    Thread(target=perform_ocr, args=(ocr_input_queue, ocr_queue), daemon=True).start()

    while cap.isOpened():
        frame = frame_queue.get()
        detections = detections_queue.get()
        fps = fps_queue.get()
        detections_adjusted = []
        if frame is not None:
            ocr_input_queue.put((frame, detections))
            ocr_results = ocr_queue.get()
            for label, confidence, bbox in detections:
                bbox_adjusted = convert2original(frame, bbox)
                detections_adjusted.append((str(label), confidence, bbox_adjusted))
                # Empty string if the OCR thread skipped this box
                ocr_text = ocr_results.get(str(bbox_adjusted), '')

            image = darknet.draw_boxes(detections_adjusted, frame, class_colors)
            if not args.dont_show:
                cv2.imshow('Inference', image)
            if args.out_filename is not None:
                video.write(image)
            if cv2.waitKey(fps) == 27:
                break
    cap.release()
    video.release()
    cv2.destroyAllWindows()

def perform_ocr(ocr_input_queue, ocr_queue):
    while cap.isOpened():
        frame, detections = ocr_input_queue.get()
        ocr_results = {}
        for label, confidence, bbox in detections:
            bbox_adjusted = convert2original(frame, bbox)

            # Extract cropped image and perform OCR
            left, top, right, bottom = convert4cropping(frame, bbox)
            cropped_image = frame[top:bottom, left:right]
            if cropped_image.size == 0:  # degenerate box, nothing to recognize
                continue
            # OpenCV frames are BGR; pytesseract assumes RGB
            cropped_rgb = cv2.cvtColor(cropped_image, cv2.COLOR_BGR2RGB)
            ocr_results[str(bbox_adjusted)] = pytesseract.image_to_string(cropped_rgb).strip()

        ocr_queue.put(ocr_results)

In this modified `drawing()` function, we create `ocr_input_queue` and `ocr_queue` and start a new thread with `Thread(target=perform_ocr, args=(ocr_input_queue, ocr_queue), daemon=True).start()`. `drawing()` sends each `(frame, detections)` pair to the OCR thread through `ocr_input_queue`, and the OCR thread sends back a dictionary of OCR results for the detected bounding boxes through `ocr_queue`.

Inside the OCR thread's loop, we extract the cropped image from each bounding box using the `convert4cropping()` function, convert it from BGR to RGB, and pass it to `pytesseract.image_to_string()`. The result is stored with the adjusted bounding box coordinates as the key, which is how `drawing()` looks it up again.


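In terms of political thinking, I believe I have always taken Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era as my guide, conscientiously fulfilled my responsibilities as a citizen, and actively practiced the core socialist values in daily life. At the same time, I am keenly aware that my ideological awareness still needs to keep improving, and that I need to study and understand Marxism-Leninism, Mao Zedong Thought, and Deng Xiaoping Theory in greater depth.
In study and research, I have always maintained a good attitude toward learning and a thirst for knowledge, listening carefully to my teachers, taking an active part in discussions, and daring to broaden my thinking and consider problems from multiple angles. During university I also actively took part in various research projects, improving my hands-on and scientific research skills through practice.
In social activities, I have always put serving others first and actively taken part in all kinds of volunteer activities, such as organizing students to join off-campus charity events and contributing to community cultural development. These activities not only trained my organizational, management, communication, and coordination skills but also strengthened my sense of responsibility and mission toward society.
In terms of abilities and strengths, I have some programming skills and skill in verbal expression. I love computer programming, am proficient in several programming languages, and have some understanding of and ability to apply data analysis and related areas. I also often take part in English speech contests and debates, which has trained my spoken English and improved my logical thinking and analytical skills.
In terms of achievements, I have won second prize in the National English Competition for College Students, third prize in the campus science and technology festival's innovation and entrepreneurship competition, and the title of Outstanding Volunteer, among other honors. These achievements are the result of my continued effort and dedication, and they give me more confidence and motivation to pursue higher goals.
In terms of existing problems, I believe I still have many shortcomings and areas to improve. For example, my time management needs work, and I sometimes fall into procrastination and wasting time. My social skills also need improvement: I need to handle interpersonal relationships better and try to interact with more people from different backgrounds.
Going forward, I will continue to expand my knowledge, study and research various fields in depth, and improve my overall competence. I will also pay more attention to improving my practical abilities, honing my hands-on and teamwork skills through project work. In social activities, I will actively take part in volunteer activities of all kinds and try to organize meaningful activities to give back to society. Finally, I hope to better develop my potential, find a suitable career direction as soon as possible, and make my own contribution to the country and society.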




Day2
Section 1
To draw a box around the detections and print OCR results in the box, you can modify the `perform_ocr` function as follows:

```python
def perform_ocr(frame_queue, detections_queue, fps_queue):
    # This version takes over the drawing loop, so it needs the frame and FPS
    # queues too and should be started in place of drawing()
    video = set_saved_video(cap, args.out_filename, (video_width, video_height))
    ocr_results = {}
    while cap.isOpened():
        frame = frame_queue.get()
        detections = detections_queue.get()
        fps = fps_queue.get()
        if frame is None:
            continue
        for label, confidence, bbox in detections:
            bbox_adjusted = convert2original(frame, bbox)  # (center x, center y, w, h)
            # Extract cropped image and perform OCR
            left, top, right, bottom = convert4cropping(frame, bbox)
            cropped_image = frame[top:bottom, left:right]
            if cropped_image.size == 0:  # degenerate box, nothing to recognize
                continue
            ocr_text = perform_paddleocr(cropped_image)
            ocr_results[str(bbox_adjusted)] = ocr_text

            # Draw box and print OCR results
            # cv2.rectangle needs corner coordinates, not (center, size)
            cv2.rectangle(frame, (left, top), (right, bottom), (255, 0, 0), 2)
            # Hershey fonts only cover ASCII; other text (e.g. Chinese) renders as '?'
            cv2.putText(frame, ocr_text, (bbox_adjusted[0], bbox_adjusted[1]-10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,0,0), 1, cv2.LINE_AA)

        if not args.dont_show:
            cv2.imshow('Inference', frame)
        if args.out_filename is not None:
            video.write(frame)
        if cv2.waitKey(fps) == 27:
            break

    cap.release()
    video.release()
    cv2.destroyAllWindows()
```

This modified code adds two calls that draw a rectangle around the detected object and print the OCR result inside the rectangle. The cv2.rectangle function takes the frame, the top-left and bottom-right corners of the rectangle, the color, and the thickness as arguments. The cv2.putText function takes the frame, the text to display, the position of the text's bottom-left corner, the font face, the font scale, the color, the thickness, and the line type as arguments.

Section 2
The line ocr_text = ocr_results.get(str(bbox_adjusted)) retrieves the OCR result for the current detection from the ocr_results dictionary.
str(bbox_adjusted) is used as the dictionary key to associate each OCR result with its detection. bbox_adjusted is the (x, y, width, height) tuple returned by convert2original(): the position and size of the box in pixels of the original frame, after undoing the resize applied for the network input. The key identifies a box only by its exact coordinates, so two detections with identical coordinates share an entry, and a box that shifts by even one pixel in the next frame produces a different key.
The get() method is called on the ocr_results dictionary with str(bbox_adjusted) as its argument. If the key exists, the corresponding OCR text is returned and assigned to ocr_text; if not, get() returns None.
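A small illustration of this lookup, with made-up coordinates and OCR text:

```python
ocr_results = {}
bbox_a = (640, 360, 320, 90)           # hypothetical adjusted box
ocr_results[str(bbox_a)] = "ABC123"    # hypothetical OCR text

hit = ocr_results.get(str((640, 360, 320, 90)))   # same coordinates: found
miss = ocr_results.get(str((641, 360, 320, 90)))  # one pixel off: None
print(hit, miss)
```

Because a moving object rarely keeps exactly the same box from one frame to the next, this key works for matching results within a frame but not as a cache across frames.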

Section 3
The line ocr_results = ocr_queue.get() retrieves the OCR results dictionary from ocr_queue.
ocr_queue is a Python queue.Queue object, which passes data between threads in a thread-safe way. In this code, it carries the OCR results dictionary from the thread that performs OCR on each frame's detections to the drawing() thread that displays the detected objects with their OCR results.
By default, get() blocks until an item is available, then removes it from the queue and returns it. Once the OCR thread calls ocr_queue.put(ocr_results), the waiting get() in the drawing() thread returns that dictionary for further processing.
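The blocking behavior can be seen in a self-contained sketch. The worker function and the dictionary contents are stand-ins, not the real OCR thread:

```python
import threading
import time
from queue import Queue

ocr_queue = Queue()

def worker():
    time.sleep(0.1)  # stand-in for OCR work
    ocr_queue.put({"(640, 360, 320, 90)": "ABC123"})  # hypothetical result

threading.Thread(target=worker, daemon=True).start()

# Blocks until the worker has put an item; the timeout avoids hanging forever
ocr_results = ocr_queue.get(timeout=5)
print(ocr_results)
```

Passing a timeout to get() raises queue.Empty if nothing arrives in time, which is one way to keep a consumer from waiting forever after the producer has stopped.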
